














                                             DECtalk PC


                                    TEXT-TO-SPEECH SYSTEM

                             TECHNICAL REFERENCE  MANUAL





















































          1st Edition, January 1992

          Copyright (c) 1992 by Digital Equipment Corporation.
          All Rights Reserved.

          Printed in U.S.A.

          The reproduction of this material, in part or whole, is strictly
          prohibited. For copy information, contact the Assistive
          Technology Group, Digital Equipment Corporation, Northboro,
          Massachusetts 01532.

          The information in this document is subject to change without
          notice. Digital Equipment Corporation assumes no responsibility
          for any errors that may appear in this document.

          This equipment generates, uses, and may emit radio frequency
          energy. The equipment has been type tested and found to comply
          with the limits for a Class A computing device pursuant to Part
          15 of FCC Rules, which are designed to provide reasonable
          protection against such radio frequency interference when
          operated in a commercial environment. Operation of this equipment
          in a residential area may cause interference, in which case the
          user, at his own expense, will be required to take measures to
          correct the interference.

          Touch-Tone and AT&T are trademarks of American Telephone and
          Telegraph Company.

          PC is a trademark of International Business Machines, Inc.

          The following are trademarks of Digital Equipment Corporation,
          Maynard, Massachusetts:

                          DECtalk         Rainbow
          DECUS           RSTS
          DECmailer       DECwriter       RSX
          DECmate         DIBOL           UNIBUS
          DECnet          MASSBUS         VAX
          DECservice      PDP             VMS
          DECsystem-10    P/OS            VT
          DECSYSTEM-20    Professional    Work Processor





















          FCC COMPLIANCE

          This equipment generates and uses radio frequency energy. It has
          been type tested and found to comply with the limits for a Class
          A computing device in accordance with the specifications in Part
          15 of FCC Rules, which are designed to provide reasonable
          protection against such radio frequency interference.

          This equipment generates, uses and can emit radio frequency
          energy and, if not installed and used in accordance with the
          instructions, may cause harmful interference to radio
          communications. However, there is no guarantee that interference
          will not occur in a particular installation. If this equipment
          does cause interference to radio or television reception, the
          user is encouraged to try to correct the interference by one or
          more of the following measures:

            o Reorient or relocate the receiving antenna.
            o Increase the separation between the equipment and the
              receiver.
            o Connect the equipment to an outlet on a circuit different
              from that to which the receiver is connected.
            o Consult the dealer or an experienced radio/television
              technician for additional suggestions.

          The user may find the booklet "How to Identify and Resolve
          Radio/TV Interference Problems," prepared by the Federal
          Communications Commission, helpful. The booklet is available from
          the U.S. Government Printing Office, Washington, DC 20402,
          Stock No. 004-000-00398-5.




          CAUTION - ELECTRIC SHOCK AND FIRE HAZARD:

          In order to minimize the risk of a fire, this module is
          constructed from a UL Recognized Component Printed Circuit
          Board. However, it is not to be operated as a stand-alone
          product. It is to be mounted within a UL Listed, CSA Certified
          product that provides it with a full fire enclosure.

          In order to minimize the risk of electric shock, this module is
          to be powered from a power supply output that is provided with
          Overload Protection and supplies Safety Extra Low Voltage (<60
          Vdc or 30 Vrms).

          Note: All Personal Computers sold by Digital Equipment
          Corporation meet these requirements.
















                                         TABLE OF CONTENTS



          INTRODUCTION

          CHAPTER 1:      Getting to Know the System
          CHAPTER 2:      How DECtalk Speech Works
          CHAPTER 3:      How To Communicate with DECtalk
          CHAPTER 4:      Text Processing
          CHAPTER 5:      Phonemics and Voices
          CHAPTER 6:      Modifying the Voices
          CHAPTER 7:      Developing an Advanced Application

          APPENDIX A:     Configuration for DECtalk Board
          APPENDIX B:     DECtalk Phonemic Symbols
          APPENDIX C:     Homographs
          APPENDIX D:     Voice Parameters

          INDEX











































                                            INTRODUCTION







          ABOUT THE DECtalk PC

          Video terminals display information from a computer on a screen.
          Printers display the same information on paper. These devices
          allow you to communicate with computers through the sense of
          sight. The DECtalk(TM) PC is another device that allows you to
          communicate with computers. However, this device speaks
          information in English: DECtalk PC does the speaking, and you,
          the user, receive the information through the sense of hearing.

          The DECtalk PC is a text-to-speech system on a single board
          designed to plug into an IBM-PC(TM) or PC-compatible. It
          converts computer text to computer speech with the latest
          DECtalk software. This system can provide any PC with a
          high-quality synthesized voice.

          ABOUT THIS MANUAL

          This manual is intended for software developers who plan to
          write application programs for the DECtalk PC board. Typical
          readers will have some experience with the IBM-PC and
          PC-compatibles and may need a reference book on using the
          DECtalk option board for PCs. Many such applications will be
          screen readers for visually impaired individuals.
          This manual may be used in conjunction with the DECtalk PC
          Installation Guide (AA-PHHPA-TH).

          Chapters 1 and 2 provide a general description of the DECtalk
          PC and how it works. Chapter 3 describes how to use and program
          the DECtalk PC DTC07 to work with your IBM-PC or PC-compatible.
          It also explains the command sequences you can use with the
          DECtalk PC. Chapters 4-7 explain the linguistics and speech
          technology capabilities of the DECtalk PC.

          The appendices at the back of the manual provide additional
          information to help the reader use and program the DECtalk
          PC. They also explain how to have the board serviced should the
          need arise.



















                  General Description

                  o       Chapter 1, "Getting to Know the System,"
                          describes the DECtalk PC DTC07 system and its
                          modules. This chapter also provides general
                          operating and testing information.

                  o       Chapter 2, "How DECtalk Speech Works," describes
                          the DECtalk speech generating system and gives
                          an overview of how the system operates.

                  o       Chapter 3, "How to Communicate with DECtalk,"
                          describes how you can communicate with the
                          DECtalk PC board via the device driver.

                  o       Chapter 4, "Text Processing," describes how
                          DECtalk processes text and adopts various word
                          spell-out strategies.

                  o       Chapter 5, "DECtalk Phonemics and Voices,"
                          describes the sound system of English. This
                          chapter shows how to modify DECtalk's
                          pronunciation to produce high-quality speech.

                  o       Chapter 6, "Modifying the Voices," shows how to
                          change the voices provided by DECtalk and how to
                          create a new voice. This chapter includes
                          commands to change the speaking rate and make
                          certain other modifications to the voices.

                  o       Chapter 7, "Developing Advanced Applications,"
                          describes some techniques for writing
                          applications. This chapter also gives
                          programming and operating hints to optimize
                          DECtalk performance.

                  Appendices



























                                             CHAPTER 1



                             GETTING TO KNOW THE SYSTEM




          This chapter provides a general description of the DECtalk PC
          system with the latest DECtalk PC speech microcode. It covers
          the DECtalk PC module and its configuration, controls, commands,
          and operating mode.

          The DECtalk PC DTC07 System

          The DECtalk PC is intended for use in single-user voice
          communication systems. The PC sends text to the DECtalk card,
          and the resulting speech can be heard through an external
          speaker.

          The DECtalk PC is a text-to-speech converter. This converter is
          an IBM PC board which contains all of the text-to-speech
          capabilities.

          The complete system includes the following components:

          (1)     DECtalk PC speech synthesizer board in an anti-static
                  envelope (Part No. DTC07-AM)

          (2)     External loudspeaker (Part No. 70-29613-01)

          (3)     Installation Manual. You are reading from this manual.
                  (Part No. AA-PHHPA-TH)

          (4)     One 720 Kbyte 3 1/2" diskette which contains an ASCII
                  version of the installation manual, as well as the
                  software to load into your DECtalk PC (Part No.
                  AK-PHHWA-BH)

          (5)     Two 360 Kbyte 5 1/4" floppy disks which contain an ASCII
                  version of the installation manual, as well as the
                  software to load into your DECtalk PC (Part No.
                  BI-PHHNA-BH). The contents of these diskettes are the
                  same as the 3 1/2" diskette.

          (6)     Audio cassette which is a spoken version of this
                  Installation Manual (Part No. AO-00003-0A.A01)

          (7)     Getting Started Reference Card in Braille as well
                  as regular print (Part No. AV-PHHQA-TH)

          (8)     Warranty Registration Card. Please fill out and return
                  this pre-addressed registration card. This will also
                  make it possible to alert you to any upgrades or
                  changes.











          The DECtalk PC Module

          The DECtalk PC module is a unit for converting text to speech.
          The board plugs into a full-length option slot of a standard
          IBM PC or PC compatible through an EISA/ISA 8/16-bit bus
          interface.

          The major components on the board are as follows:

                  80C186 Microprocessor (Main Microprocessor)
                  DMA Interface
                  BIOS ROM
                  TMS32010 (Digital Signal Processor)
                  Audio Amp

          Power Supply Identification

          The modules receive power through a common power supply in the
          PC. The power switch on the system unit of the PC powers up the
          board. The DECtalk PC board draws only around 2.5W. However,
          because most systems will contain a number of other boards, it
          is recommended that the power supply on the system be a minimum
          of 130W. The IBM/AT(TM) is shipped with a 170 W (or 250 W) power
          supply. The IBM/XT(TM) is shipped with a 130 W (minimum) power
          supply. The wattage is usually printed on the top of the power
          supply. If the IBM/PC has a hard disk or a hard card, the IBM
          recommended supplier should have upgraded the power supply to
          130W or greater.

          Power Requirements

          The power requirements are as follows:

                  DC Amps         500 mA @ +5V
                  DC Amps         50 mA @ -12V
                  DC Amps         125 mA @ +12V
                  Bus Loads       1


          Module Configuration

          At power-up, the module controller consults the DIP switch pack
          to determine the required operating characteristics. Switch
          functions and I/O and IRQ settings are described in Appendix A
          of this manual. The DECtalk option module card comes with all
          switches set to default settings. (These are set at the
          factory.) In most systems the DECtalk board can be installed
          without a conflict. However, when a number of other options are
          already installed in the PC, a conflict may arise and the
          DECtalk module will have to be reconfigured to avoid it.
          This is done by changing the settings on the DIP switch pack as
          described in Section 2 of the Installation Manual.

          Module Controls

          The interface with the DECtalk PC module is through a Terminate
          and Stay Resident (TSR) memory driver. See Chapter 3 for
          further details.

          Software

          In addition to the PC board, DECtalk PC comes with software on
          3 1/2" and 5 1/4" floppy disks. The floppy disks contain the PC
          software and speech microcode necessary to load and run the
          text-to-speech synthesis. The software on the 3 1/2" diskette is
          identical to that on the 5 1/4" diskettes. The two different
          size diskettes are provided for the convenience of users who may
          have one type of drive but not the other. The installation
          manual which comes with your DECtalk PC contains a description
          of the components of the system as well as instructions on
          downloading the software to the board.

          Installation

          Installation is initiated by inserting the distribution diskette
          and typing the command INSTALL. Product installation will
          create or update the necessary system files (i.e., autoexec.bat).
          Successful installation will cause the board to speak its
          startup message. Installation will initially load the board with
          its software and prompt the installer, via both tones and screen
          menus, through the install procedure. Installation will give
          the user the option to select a default or variable installation
          mode. The variable installation mode allows the user to change
          defaults such as the default drive and sub-directory.

          Power-Up Self-Test

          After the DECtalk PC software has been downloaded successfully
          (see the DECtalk PC Installation Manual), DECtalk is ready to
          speak.

          If the system has been powered up and the software downloaded
          but DECtalk does not speak, make sure the power cord is plugged
          in and the system unit is powered on.

          At power-up, the self-test first checks the power supply. If the
          power cord and the AC outlet are both fine but DECtalk still
          does not speak, check that the power is on. Many PCs have an LED
          on the power switch.

          At successful power-up, the PC will display the following
          diagnostic message from the DECtalk BIOS program on the screen:


                  DECtalk PC adapter at address : D800 BIOS V0.75
                  Copyright (c) 1991 Digital Equipment Corp.
                  All rights Reserved
                  Board 0; Tests 0; IO address 340, IRQ 3


          The DECtalk PC will also say:

                  "DECtalk PC V4.0 is running."

          In the event that the DECtalk PC has failed, a diagnostic
          message will appear on the screen. If this occurs, refer to
          Section 2 of the Installation Manual for the correct procedure
          to have the problem solved.

          Operating Mode

          The DECtalk PC module is ready to operate when the successful
          power-up message is spoken. In this state, the DECtalk board is
          controlled by the PC and is in operating mode.

          System ROM BIOS, I/O, and IRQ Identification

          After any system reset, the PC BIOS begins to execute its
          power-on self-test (POST). In a search for adapter cards
          installed in slots with system-accessible ROM, POST hunts for
          the 55AA signature at the base of each 2K block at addresses in
          the range from C0000 through E0000.
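
          The signature hunt described above can be sketched as follows.
          This is a minimal illustration in Python, not the PC BIOS code:
          the memory image, the function name find_rom_bases, and the
          decision to treat E0000 as an exclusive upper bound are all
          assumptions made for the example.

```python
SIGNATURE = b"\x55\xAA"   # the 55AA signature, as the bytes 55h then AAh
BLOCK_SIZE = 0x800        # POST looks at the base of each 2K block
SCAN_START = 0xC0000
SCAN_END = 0xE0000        # ASSUMPTION: treated here as an exclusive bound


def find_rom_bases(memory, start=SCAN_START, end=SCAN_END):
    """Return the base addresses of 2K blocks that start with 55AA.

    `memory` is a bytes-like image of the address space (hypothetical;
    a real POST reads physical memory directly).
    """
    bases = []
    for base in range(start, end, BLOCK_SIZE):
        if bytes(memory[base:base + 2]) == SIGNATURE:
            bases.append(base)
    return bases


# Simulated 1 MB address space with one adapter ROM at D8000.
image = bytearray(0x100000)
image[0xD8000:0xD8002] = SIGNATURE
print([hex(base) for base in find_rom_bases(image)])
```

          Only block-aligned signatures are reported, which is why a ROM
          must sit on a 2K boundary to be found.
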

          DECtalk PC board BIOS ROMs can be located at one of several base
          addresses, as set via DIP switch SW1:

                  o C8000
                  o D0000
                  o D8000 (Default)

          BIOS POST will call each of the installed DECtalk PC boards'
          ROMs.

          Each DECtalk PC board has an I/O port located at one of several
          addresses, as set by switches:

                  o 240
                  o 250
                  o 340   (Default)
                  o 350

          IRQ levels for each board can be selected as level 3 through 7
          (with switches).

          The first DECtalk PC ROM called by BIOS POST will:

            o Look for all installed DECtalk PC boards (including itself)
              using the I/O port
            o Tell each DECtalk PC board to initiate a self-test sequence
            o Retrieve the status of each DECtalk PC board's self-test
              results
            o Determine the hardware IRQ for each installed board
            o Build a DECtalk PC parameter block in the BIOS RAM area
            o Grab the INT vector for DECtalk PC BIOS and initialize it
              with the address of the BIOS entry
            o Far return to the caller (BIOS POST)

          Each additional (if any) DECtalk PC ROM called by BIOS POST will:

            o Test for the presence of a non-null DECtalk PC INT vector
            o Far return to the caller (since the first board already
              installed the vector)

          Since all DECtalk PC ROMs are identical, it does not matter
          which is selected first by the BIOS POST.
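
          The "first ROM does the work, later ROMs return at once"
          behavior can be modeled in a few lines. The sketch below is a
          simulation only: the vector table is a plain dictionary, and
          the names rom_entry and init_boards are hypothetical. Vector
          65h is taken from the BIOS call example in this chapter.

```python
DTPC_INT = 0x65  # DECtalk PC BIOS vector, as used in the manual's example


def rom_entry(vectors, init_boards):
    """Model one DECtalk PC ROM being called by BIOS POST.

    `vectors` maps interrupt numbers to handlers (None or absent means
    a null vector). `init_boards` stands in for the hunt, self-test,
    IRQ sensing and parameter-block steps. Returns True if this ROM
    performed the initialization.
    """
    if vectors.get(DTPC_INT) is not None:
        # An earlier ROM already installed the vector: just return.
        return False
    init_boards()
    vectors[DTPC_INT] = "dtpc_bios_entry"  # placeholder for the entry address
    return True


vectors = {}
runs = []
print(rom_entry(vectors, lambda: runs.append("init")))  # first ROM
print(rom_entry(vectors, lambda: runs.append("init")))  # additional ROM
```

          Because the vector itself serves as the "already initialized"
          flag, the ROMs need no other shared state, which is consistent
          with all ROMs being identical.
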


          BIOS Parameter Block


          This table is created only if there is (at least) one DECtalk
          board installed in the system. The BIOS parameter block (bpb) is
          based at dtpc_bpb and is formatted as follows:

                  ADDRESS         CONTENTS
                  DTPC_BPB+0      STATUS FOR INSTALLED BOARD 0
                  DTPC_BPB+1      STATUS FOR INSTALLED BOARD 1
                  DTPC_BPB+2      STATUS FOR INSTALLED BOARD 2
                  DTPC_BPB+3      STATUS FOR INSTALLED BOARD 3

          Each bpb status byte contains the following fields:

                  Installed       Tests       IRQ       IO
                              BPB STATUS BYTE

          The field values are encoded as follows:

          Field       Description                   Encoding
          Installed   Is the board present?         0: no board; 1: installed
          Tests       What are self-test results?   0: success; 1..3: error code
          IRQ         What is the IRQ?              0..1: illegal; 3..7: IRQ level
          IO          What is the I/O address?      0: 240; 1: 250; 2: 340; 3: 350
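
          As an illustration of the encoding table, the following Python
          sketch unpacks one status byte. The manual gives the field
          order and value ranges but not the bit positions, so the layout
          used here (most significant bit first, with widths of 1, 2, 3
          and 2 bits) is an assumption, as are the names BoardStatus and
          decode_bpb_status.

```python
from dataclasses import dataclass

# IO field encodings 0..3, from the table above.
IO_ADDRESSES = (0x240, 0x250, 0x340, 0x350)


@dataclass
class BoardStatus:
    installed: bool
    test_result: int   # 0 = success, 1..3 = error code
    irq: int           # raw IRQ field value
    io_address: int


def decode_bpb_status(status):
    """Decode one bpb status byte.

    ASSUMED layout, most significant bit first:
    Installed (1 bit) | Tests (2 bits) | IRQ (3 bits) | IO (2 bits)
    """
    if not 0 <= status <= 0xFF:
        raise ValueError("status must fit in one byte")
    return BoardStatus(
        installed=bool((status >> 7) & 0x1),
        test_result=(status >> 5) & 0x3,
        irq=(status >> 2) & 0x7,
        io_address=IO_ADDRESSES[status & 0x3],
    )


# Under the assumed layout, 8Eh is an installed board that passed its
# self-test and uses IRQ 3 at I/O address 340 (the factory defaults).
print(decode_bpb_status(0x8E))
```
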

          FINDING A BOARD

          This is a hunting sequence executed by the host processor
          calling the adapter BIOS at the initialization entry. For i = 0
          to 3:

            o Read value(i) from the port at the i-th of the addresses
              240, 250, 340, 350
            o If value(i) = MODULE_init at a port address, update the bpb
              entry
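
          The hunting loop can be sketched as below. The port read is
          passed in as a function so the example runs without hardware.
          The numeric value of MODULE_init is not given in this manual,
          so the constant used here is a placeholder, and find_boards is
          a hypothetical name.

```python
PORTS = (0x240, 0x250, 0x340, 0x350)  # candidate I/O addresses, i = 0..3
MODULE_INIT = 0xA5  # PLACEHOLDER: the real MODULE_init value is not documented here


def find_boards(read_port, module_init=MODULE_INIT):
    """Return the port addresses at which a board answered MODULE_init.

    `read_port(address)` stands in for an I/O port read on the host.
    """
    found = []
    for port in PORTS:
        if read_port(port) == module_init:
            found.append(port)  # the BIOS would update a bpb entry here
    return found


# Simulated system with a single board at the default address 340.
def fake_read(port):
    return MODULE_INIT if port == 0x340 else 0xFF


print([hex(port) for port in find_boards(fake_read)])
```
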











          SELF TEST

          If a board was found by the hunting sequence, the BIOS
          initialization code then issues a CMD_self_test command to the
          80186 side of the board. The 80186 boot code tests the module
          and returns a test result which the BIOS unit init code stores
          in a bpb entry.


          IRQ SENSING


          If self-test for the board is successful, the IRQ level
          (selected by SW1) for the board will be determined by the BIOS
          init code issuing CMD_isaint_on to the 80186 (which is in the
          wait_id() loop) and "catching" the resultant interrupt on one of
          the possible IRQ levels. During the interrupt service code,
          CMD_isaint_off is sent to the 80186. The bpb entry is updated as
          well.

          BIOS INTERFACE

          User application programs (typically executed at autoexec.bat
          execution time) can send commands to the DECtalk PC BIOS. Once a
          device driver and/or TSR is loaded, it is expected that the BIOS
          interface will not be used. An example of a call to the DECtalk
          PC BIOS is as follows:


                  mov     ah,0            ; function 00: get bpb_base address
                  int     65h             ; call the DECtalk PC BIOS vector


          A list of DECtalk BIOS functions is as follows:

                  00      Read bpb_base address
                  01      Self-test start
                  02      Reset module


























                                      CHAPTER 2



                                HOW DECtalk PC WORKS


          This chapter describes how DECtalk PC converts ASCII data into
          voice  output.

          CONVERTING TEXT TO SPEECH
          You enter text and commands into the PC via a keyboard or other
          input device. The PC can then send this ASCII text to  DECtalk
          PC through the PC bus. DECtalk PC converts this data into
          speech by a three-level process.

          Level 1

          DECtalk PC first accepts text from the PC and converts the text
          from one code into another. The text is in ASCII format when it
          enters DECtalk PC, and is converted to phonemic code for
          further processing.

          Phonemic code uses the phonemic alphabet described below. Each
          symbol in the phonemic alphabet has only one pronunciation.
          DECtalk PC uses an internal dictionary and the rules of English
          pronunciation to perform this conversion.

          Level 2

          The phonemic code is converted into synthesizer control
          parameters. These are continuous variables which control
          aspects of the speech such as pitch, amplitude, duration and the
          like for the various DECtalk PC voices.

          Level 3

          The speech synthesizer uses the control parameters to generate a
          speech waveform. This waveform is converted to an analog speech
          signal through a D/A converter.

          In Levels 2 and 3, a synthesizer control command (a set of
          phonetic parameters) is generated every 6.4 milliseconds, and
          the digital signal processor generates a speech waveform value
          every 100 microseconds. This process generates "frames" of
          speech. DECtalk PC acts somewhat like a TV picture in that
          these frames of speech are presented to the listener just as
          frames of pictures are presented to the viewer. In both cases,
          the frames appear to be one continuous, unbroken sequence.
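
          The two timing figures above imply a sample rate and a frame
          size. The short Python calculation below works them out,
          assuming the 6.4 millisecond and 100 microsecond periods are
          exact; the variable names are chosen for the example only.

```python
FRAME_PERIOD_US = 6400    # one set of control parameters every 6.4 ms
SAMPLE_PERIOD_US = 100    # one waveform value every 100 microseconds

# Integer microseconds keep the arithmetic exact.
samples_per_frame = FRAME_PERIOD_US // SAMPLE_PERIOD_US
sample_rate_hz = 1_000_000 // SAMPLE_PERIOD_US
frames_per_second = 1_000_000 / FRAME_PERIOD_US

print(samples_per_frame)   # 64 waveform values per frame
print(sample_rate_hz)      # 10000 values per second
print(frames_per_second)   # 156.25 frames per second
```

          In other words, each frame of control parameters shapes 64
          consecutive waveform values.
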

          DECtalk PC SOFTWARE PROGRAM
          The three-level process described in the last section happens in
          a number of discrete program modules. Each module is described
          briefly in the following paragraphs.














          Converting ASCII Text to Phonemic Code

          1. A sentence parser breaks the input stream into separate words
          and locates some clause boundaries (indicated by commas and
          other punctuation marks as well as special words). The sentence
          parser also recognizes and deals with phonemic symbols and
          commands that you may have added to the input text. Phonemics
          are discussed below.

          2. A word parser breaks words into their component parts,
          yielding words in their final pronounceable form. Strings of
          text that do not form pronounceable English words are spelled
          out letter by letter. A number formatter is used if the text
          contains numerals. The number formatter knows the rules for many
          common number formats and converts the numbers into English
          words. The number formatter also recognizes many common
          abbreviations, such as "lb." for "pound(s)." Number-speaking
          rules are discussed later in this manual.

          3. A dictionary lookup routine searches the pronunciation
          dictionaries. DECtalk PC has a built-in dictionary of many
          commonly used words. DECtalk PC also has a definable dictionary
          for developers and users that can be filled with words specific
          to an application. This dictionary and its loading are described
          below. While this version of DECtalk has greater pronunciation
          accuracy than its predecessors, it may sometimes be necessary to
          send the DECtalk card the correct phonemics for words important
          for a particular application. This can be done using the
          developer-definable dictionary. Dictionaries are searched in the
          order in which they have been loaded onto the DECtalk board.

          4. A letter-to-sound module uses a set of English pronunciation
          rules to assign phonemic form and lexical stress patterns to
          words not found in the dictionary. See the chapter on
          Phonemics for ways to modify the phonemic form of words, and
          subsequent chapters for special voice qualities (such as
          emphasis and singing).

          5. A phrase structure module recombines all phonemic output from
          the dictionary search and other modules. Durations of phonemes
          and pitch commands are computed for the clause, and appropriate
          sound variants are selected for those phonemes that can be
          pronounced in more than one way.
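
          Steps 3 and 4 above amount to an ordered lookup with a
          rule-based fallback. The Python sketch below illustrates that
          control flow only. It assumes the first dictionary containing a
          word wins, which the manual does not state explicitly, and the
          function name, the dictionary contents and the stand-in
          letter-to-sound function are all hypothetical.

```python
def to_phonemes(word, dictionaries, letter_to_sound):
    """Look a word up in each dictionary in load order.

    `dictionaries` is a list of dicts, earliest-loaded first.
    ASSUMPTION: the first dictionary containing the word supplies the
    pronunciation. Words found nowhere go to the letter-to-sound rules.
    """
    key = word.lower()
    for dictionary in dictionaries:
        if key in dictionary:
            return dictionary[key]
    return letter_to_sound(word)


# Placeholder entries; these strings are not real DECtalk phonemics.
first_loaded = {"dectalk": "<entry from first dictionary>"}
second_loaded = {"dectalk": "<entry from second dictionary>",
                 "dtc07": "<entry from second dictionary>"}


def rules(word):
    # Stand-in for the letter-to-sound module.
    return "<rules applied to " + word + ">"


for w in ("DECtalk", "DTC07", "synthesizer"):
    print(w, "->", to_phonemes(w, [first_loaded, second_loaded], rules))
```
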

          Converting Phonemic Code to Synthesizer Control Commands

          6. The phoneme-to-voice module processes clauses passed from the
          phrase structure module and converts them to control signals
          for the speech synthesizer. This module modifies the clauses by
          changing the phonemes/allophones into parameters that determine
          the natural resonant frequencies of the vocal tract
          (formants), sound source amplitudes, and the like. The control
          parameters are sent to the speech synthesizer for output.

          Converting Control Commands to Speech

          7. The digital speech synthesizer computes a speech waveform
          with acoustic characteristics that are determined by the
          synthesizer control commands received. Replicating the sounds of
          human speech in a natural way is an extremely difficult task.
          Dozens of acoustic parameters and thousands of values of these
          parameters have to be taken into account. DECtalk PC is widely
          believed to be the best English-language synthesizer available
          anywhere for intelligibility and naturalness.


          DIFFERENCES FROM OTHER SERIAL LINE DECTALKS


          Many users may be familiar with the serial-line DECtalk
          speech synthesizer. The DECtalk PC with DECtalk software is
          compatible with the linguistic capabilities and high-quality
          speech of other DECtalks. However, there are a number of
          differences and improvements.


                  GENERIC FUNCTIONALITY

                  Compatibility Mode

          All DTC01/3 phonemic and voice commands are supported.

                  PC-Board

          DECtalk PC is a PC bus system rather than a serial line
          system like its predecessors. It therefore can easily plug into
          any IBM-PC, PC/AT or true PC compatible and can be used with
          software written for the PC.

                  Softloadable

          DECtalk PC is a card plus software which can be
          downloaded onto the card. Software (programs and text) comes in
          two distribution sizes: (a) one 720 Kbyte 3 1/2" diskette;
          (b) two 360 Kbyte 5 1/4" floppy disks.
          Softloadability provides you with more flexibility for future
          upgrades.


                  Latest Version of Speech Synthesis

          DECtalk PC software contains the latest version of
          DECtalk speech synthesis. This incorporates a number of
          improvements and fixes to the firmware used in earlier ROM-based
          systems.


                  Settable Volume Control

                  DECtalk PC contains settable volume control both in
          hardware and in software. There is a volume control on the
          loudspeaker which comes with the PC- board and there is also a
          volume control which is settable by a command sequence.
                  Tone Generation Capability

                  DECtalk PC will generate certain tones (e.g., for margin
          bell, alert etc.) in addition to speech sounds. It also
          maintains its ability to sing and the developer still has the
          flexibility of modifying acoustic parameters such as pitch,
          duration and the like to create different voice qualities.


                  Immediate Stop Speaking

                  DECtalk PC allows the application to terminate speech
          immediately instead of waiting for the buffered text to
          complete. Stop Speaking can be accomplished by commands such as
          [:pause], which acts immediately. DECtalk will still accept
          commands.  DECtalk also has the ability to resume speaking where
          the text left off or to flush text and immediately start
          speaking new text.

                  Letter Mode, Word Mode and Clause Mode.
                  DECtalk PC is able to immediately speak single
          characters without waiting for an entire clause to be buffered.
          This is useful in applications requiring tactile feedback for
          what was typed on the keyboard. It also provides normal clause
          buffering for highly natural speech. DECtalk can speak letters,
          words, phrases, clauses, paragraphs and whole documents.

                  Shorter Command Strings
          Many of the command strings, such as change rate, change voice,
          start, stop, index, index reply and the like, can be shortened
          for greater ease of use in applications.

                  Increased Buffer Size
          Buffer size in DECtalk has been significantly increased. The
          input buffer size is 4Kbytes and the output buffer size is
          4Kbytes.

                  High Quality Speech
          DECtalk's speech retains its high quality. In addition, a number
          of improvements have been made in functionality, acoustic
          phonetic quality and naturalness.

                  Increased Word Pronunciation Accuracy



                  The accuracy of word pronunciation is higher than in any
          previous version of DECtalk. Words are rarely mispronounced.
          There have been significant improvements in the accuracy and
          quality of letter-to-phoneme rules. Also, DECtalk PC contains
          very large built-in dictionaries which improve both the
          pronunciation of individual words and the rhythmic naturalness
          of speech. The fixed dictionary in DECtalk PC software is many
          times larger than any previous DECtalk dictionary, and the
          entries are more complex and contain a wider variety of
          information which helps to increase naturalness. As in the
          past, such fixed dictionaries are inaccessible to the user,
          although the user dictionary is accessible and modifiable.

                  Improved Pronunciation Heuristics
                  Certain heuristics have been improved and made more
          intelligent. For example, DECtalk PC is better able to recognize
          and parse unpronounceable sequences such as uppercase
          initialisms (FBI, AAA, etc.) in addition to the normal
          unpronounceable sequences such as those with no vowels (CBS,
          NBC).

                  Larger User Dictionary
                  DECtalk PC contains a user dictionary. This dictionary
          can be used to load application-specific words, DOS-specific
          terms, and the like.  This dictionary is now much larger than
          those of the earlier DECtalks.  The size is variable and depends
          upon what other software is resident on the board.  However, the
          usable space may be as high as 350K bytes and would be
          sufficient to load thousands of words. Because of the large
          dictionary, developers can now input many keyboard key names and
          commonly used DOS(tm) and PC application words and commands (e.g.,
          autoexec.bat, config.sys, etc.).

          Note: Dictionaries will be searched in the order in which they
          are downloaded.





















                  Faster Speech Rate
                  Speech rate on DECtalk PC runs from a slow speed of 120
          wpm to an upper rate of 550 wpm. This is 200 wpm faster than
          other DECtalks and thus is more useful for applications where
          scanning large bodies of text is necessary.


                  New Name Pronunciation Capabilities

                  One of the items most requested from previous versions
          of DECtalk has been an improved ability to pronounce proper
          names such as first names, last names and street names.  DECtalk
          is now capable, for the first time, of pronouncing proper names
          with a high degree of accuracy and greater level of
          intelligence. The rules behind the name pronunciation routines
          were originally developed for large commercial telecom
          applications but have now been modified to run on the DECtalk
          PC.  This can be run in different modes and should be a great
          help to applications which require people's names and addresses
          to be pronounced correctly.

                  Digitized Voice Output

                  DECtalk PC has the capability of outputting digitized
          speech in addition to synthesized speech. You may digitize a
          voice segment with one of the available digitizing boards on the
          market, store such speech in a file on the PC, and play it back
          through the DECtalk.



























                                                  CHAPTER 3

                  HOW TO COMMUNICATE WITH DECTALK



          COMMUNICATING WITH THE DECTALK PC

                  There are two basic ways to communicate with the DECtalk
          PC. You may treat it as a standard ASCII device like the DTC01
          (stand-alone DECtalk) connected as either a serial printer or a
          parallel printer. You may also communicate directly with the TSR
          driver DT_DRIV.EXE that supports the DECtalk. Both types of
          functionality can be present simultaneously, so it is possible to
          use both paths at the same time. If both paths are used,
          synchronization of commands and data is the responsibility of
          the application.


          Driver Configuration Options

                  The DT_DRIV.EXE program will take various arguments to
          allow for different desired configurations. The default
          configuration for DECtalk PC is a base address of 340, an IRQ of
          3, and serial trapping enabled on port 3 (COM4). The following
          is a list of the arguments that the driver program accepts.



                  -C D    Sets the com port to number D (1 = COM2,
                          2 = COM3, or 3 = COM4).
                  -L D    Enables parallel printer trapping (1 = LPT2,
                          2 = LPT3).
                  -B XX   Sets the base module address to XX (hex
                          number). This command must be used in
                          conjunction with the -I command (below).
                  -I D    Sets the IRQ to D.
                  -R      Removes the driver.
                  -V      Calls for verbose printout of information by
                          the driver.



          Serial and Parallel Communication

                  The DECtalk driver will trap any interrupts for
          whichever device it has been configured to emulate and should
          appear to the calling program just like a normal serial or
          parallel card at the BIOS level. Since there is no parallel or
          serial hardware, settings such as baud rate are accepted and
          echoed back, but they have no effect. In emulation mode, the
          DECtalk PC will operate just as quickly with the baud rate set
          at 110 baud as it will at 19.2 kbaud. This manual does not
          address how to make BIOS calls to a serial or parallel device,
          as that information is generally available and the details are
          specific to the compiler being used rather than to the DECtalk
          PC. For informational purposes, the definitions for the comm
          port bits have been included in the DTTSR.H file.


          Communicating Directly with the TSR.

                  To talk directly with the TSR, the application must
          use the 2F multiplex interrupt. The call is made using an
          INT86 register structure with ah = DECTALK_ID and al = the
          function code. bl contains any parameter that needs to be
          passed along with the command. The Status command returns the
          status of the DECtalk, not the comm status; the comm status
          information is returned in dx, so you can get both from the
          TSR Status command. The accompanying program examp1.c provides
          some simple illustrations of the invocation of these commands.
          As an example of a COM BIOS call, the status information in
          that program is obtained by an INT14.


                                 DECTALK TSR FUNCTION CODES


          Function                        Code

          DECTALK_ID                      0xD0
          ID to pass in ah to the multiplex interrupt (INT 2Fh).

          INSTALL_CHECK                   0x00
          Checks whether the TSR is installed.

          INSTALLED                       0xFF
          Value returned in al.

          DECTALK_EXIT                    0x01
          Remove the TSR and restore things to their original state.

          DECTALK_TEST                    0x02
          Simply returns TSR_SUCESS if the TSR is present.

          GET_STATUS                      0x03
          Get current status. If you want comm status information, it
          is returned in the dx register.

          VOLUME_UP                       0x04
          Increase volume by the amount specified in bl.

          VOLUME_DOWN                     0x05
          Decrease volume by the amount specified in bl.

          VOLUME_SET                      0x06
          Set volume to the amount specified in bl (0-100).

          Reserved                        0x07
          Reserved (allocate memory).

          SEND_CHAR                       0x08
          Send the character in bl to the DECtalk.

          GET_CHAR                        0x09
          Get a character. This call may cause unpredictable results
          if the TSR does not have a character.

          SEND_BUFF                       0x0a
          Send a buffer of characters. Assumes a fixed buffer size of
          256. The segment address is passed in dx and the offset
          address in bx. Together they form a pointer to a structure
          called DECTALK_CHAR_BUFF, found in the DTTSR.H file, which
          contains a count and a pointer to the actual buffer (256
          words maximum).

          GET_BUFF                        0x0b
          Get a buffer of characters. Assumes a fixed buffer size of
          256. The segment address is passed in dx and the offset
          address in bx. Together they form a pointer to a structure
          called DECTALK_CHAR_BUFF, found in the DTTSR.H file, which
          contains a count and a pointer to the actual buffer (256
          words maximum).



          reserved                        0x0c
          reserved                        0x0d
          reserved                        0x0e
          reserved                        0x0f
          reserved                        0x10
          reserved                        0x11
          reserved                        0x12
          reserved                        0x13
          reserved                        0x14


          PAUSE_OUTPUT                    0x15
          Pause output.

          RESUME_OUTPUT                   0x16
          Resume talking after a pause.

          FLUSH_TEXT                      0x17
          Flush pending text. This is an asynchronous flush; when
          buffers are being sent, it cannot guarantee to flush every
          character in process. It will always work, however, if a
          flush is then also inserted in the text stream. The stream
          flush is recommended when it can be used. FLUSH_TEXT is
          provided so that if the DECtalk PC is I/O bound with large
          text buffers, it can be flushed quickly, asynchronously to
          the text stream. (See the [:flush all] command below.)

          DIGITIZED_MODE                  0x18
          Put DECtalk into digitized speech mode.

          TEXT_MODE                       0x19
          Put DECtalk into text-to-speech mode.

          DIGITIZED_DATA                  0x1a
          Play digitized data. The segment of the data block pointer
          (which contains the address of the buffer and the count) is
          in dx and the offset is in ax.

          FLUSH_SPEECH                    0x1b
          Flush speech but process all commands until an [:enable] is
          seen. Then start speaking again. This command is slower than
          the flush_text command.
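
          As an illustration only, the register setup used for a TSR
          call (ah = DECTALK_ID, al = function code, bl = parameter)
          might be sketched in C as shown here. The function codes are
          copied from the table above. The dt_regs structure and the
          dt_pack and dt_volume_set helpers are hypothetical names made
          up for this sketch; they are not taken from DTTSR.H or
          examp1.c. The sketch only prepares register values. Issuing
          the interrupt itself depends on the compiler being used.
          Clamping the volume to 0-100 on the caller's side is an
          assumption, not documented driver behavior.

```c
#include <stdint.h>

/* Function codes copied from the table above (subset). */
enum {
    DECTALK_ID    = 0xD0,
    INSTALL_CHECK = 0x00,
    GET_STATUS    = 0x03,
    VOLUME_SET    = 0x06,
    SEND_CHAR     = 0x08,
    PAUSE_OUTPUT  = 0x15,
    RESUME_OUTPUT = 0x16,
    FLUSH_TEXT    = 0x17
};

/* Hypothetical stand-in for the ax and bx members of a compiler's
   register union; not a type from DTTSR.H. */
struct dt_regs {
    uint16_t ax;
    uint16_t bx;
};

/* Pack ah = DECTALK_ID, al = function code, bl = parameter. */
struct dt_regs dt_pack(uint8_t function, uint8_t param)
{
    struct dt_regs r;
    r.ax = (uint16_t)((DECTALK_ID << 8) | function);
    r.bx = param;                /* bh is left as zero */
    return r;
}

/* Prepare a VOLUME_SET call, clamping to the documented 0-100 range
   (the clamp is this sketch's assumption). */
struct dt_regs dt_volume_set(int volume)
{
    if (volume < 0)
        volume = 0;
    if (volume > 100)
        volume = 100;
    return dt_pack(VOLUME_SET, (uint8_t)volume);
}
```

          With a DOS compiler, the ax and bx values would then be
          copied into the compiler's own register structure before
          invoking interrupt 2F.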



          SPEECH CONTROL

          There are three ways to control speech.

                  1.  Through English text (sentences in standard English
                      format and spelling). DECtalk PC speaks this text as
                      written. Chapter 4 discusses this in more detail.

                  2.  Through phonemic spelling (sentences or phrases
                      written in phonemic symbols). Phonemic spelling is
                      closer to the actual pronunciation of the text and
                      is always enclosed in square brackets. For example,
                      if you were to phonemicize the sentence "This is an
                      example of phonemic spelling", it would look like
                      this: [dh'ihs ihz axn ixgz'aempel axv faxn'iymixk
                      sp'ehlixnx]. Chapter 5 discusses this in more
                      detail.

                  3.  Through special commands. Commands control features
                      of speech that are not obvious from the visible
                      text, such as sex of the speaker and excitement
                      level. Appendix D discusses this in more detail.

          Voice Control Commands.
          DECtalk PC supports DECtalk DTC01 voice control commands such as
          phonemic text, change of voices, speaking rates and the like.

          Note: DECtalk PC supports only multi-character (Arpabet)
          phonemic mode.

          A simpler user interface command set has been written to allow
          software developers more flexibility in manipulating the various
          parameters of DECtalk speech. These commands allow for such
          functions as select voices, select speaking speeds, stop
          speaking, generate tones and so on.  These  perform the same
          functions as the escape sequences used in DTC01 wherever those
          functions existed.

          There are a number of new features offered by DECtalk PC:

                  o Nine predefined voices (four male, four female and
                    one child)
                  o Definable voice
                  o Speaking rates from 120 wpm to 550 wpm
                  o Ability to say letters, words or phrases
                  o Pause, Continue and Stop speaking control
                  o More accurate letter-to-sound pronunciation rules
                  o Large internal (fixed) dictionary
                  o Expanded developer-definable dictionary
                  o Punctuation control for pauses, pitch and stress
                  o Output volume control


          DECTALK PC SPEECH FUNCTIONALITY COMMAND SET

          The following is an alphabetized list of commands for DECtalk
          PC. Each entry consists of:

                  (a) Function: what the command is supposed to
                      accomplish.
                  (b) Command: the text you type in to tell DECtalk PC
                      what to do.
                  (c) Escape Value: the escape value for the command.
                      (Escape sequences used on previous versions of
                      DECtalk may still be used in compatibility mode.)
                  (d) Options: the options which you may choose from.
                  (e) Parameters: the required parameters for those
                      options.
                  (f) Default.
                  (g) Examples.
                  (h) Description: an informal description of what the
                      command does, along with examples where pertinent.

          All short commands begin with "[:" and end with "]".  Any
          unique substring of the command is valid, e.g., [:ra 180] is
          equivalent to [:rate 180], [:co is sufficient for [:comma
          pause], and so on.
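
          The abbreviation rule can be illustrated with a small C
          sketch. It assumes that a "unique substring" means a unique
          leading substring, as in the [:ra and [:co examples; the
          two-letter forms such as [:dv or [:cp that appear elsewhere
          in this chapter are not handled. The command list is only a
          subset of names taken from this chapter, and dt_expand is a
          hypothetical helper, not part of the DECtalk PC software.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative subset of command names taken from this chapter. */
static const char *const dt_commands[] = {
    "comma", "define", "dial", "digitized", "enable", "error",
    "flush", "index", "log", "mode", "pause", "period",
    "phoneme", "rate"
};

/* Return the full command name for which `abbrev` is a unique
   leading substring, or NULL if it matches no command or more
   than one. An exact match always wins. */
const char *dt_expand(const char *abbrev)
{
    size_t n = strlen(abbrev);
    size_t count = sizeof dt_commands / sizeof dt_commands[0];
    const char *found = NULL;
    int matches = 0;
    size_t i;

    if (n == 0)
        return NULL;
    for (i = 0; i < count; i++) {
        if (strncmp(dt_commands[i], abbrev, n) != 0)
            continue;                   /* not a prefix of this name */
        if (dt_commands[i][n] == '\0')
            return dt_commands[i];      /* exact match */
        found = dt_commands[i];
        matches++;
    }
    return matches == 1 ? found : NULL;
}
```

          In this sketch dt_expand("ra") yields "rate", while
          dt_expand("p") yields NULL because pause, period and phoneme
          all begin with that letter.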




          Note: Commands are synchronous unless otherwise stated. Also,
          several commands used together may interact with each other and
          affect the output.

          Note: If a command is sent with incorrect syntax, the closing
          bracket may be ignored, as it might be considered part of the
          illegal string.  To recover from this situation, you will need
          to send an extra "]" to the parser. To help avoid this
          situation, turn the [:error] command on.


          FUNCTION:                               COMMA PAUSE
          COMMAND:                                [:comma DD]
          ESCAPE VALUE:                           202
          OPTIONS:                                None
          PARAMETERS:                             Pause time in milliseconds
          DEFAULT:                                160 ms.
          EXAMPLES:                               [:comma 250]
                                                  ESC P 0 ; 202 ; 250 z

          DESCRIPTION:
          The comma pause may be lengthened or shortened. [:cp 0] resets
          the comma pause to its default state (approximately 160 ms).
          The comma pause can be incremented by as much as 30000 ms and
          decremented by as much as 40 ms (a value of -40). All values
          outside the legal range default to the nearest legal value.
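
          The escape-sequence form shown in the examples can be
          assembled with a short C routine such as the sketch below.
          It assumes that the spaces printed in "ESC P 0 ; 202 ; 250 z"
          are only for readability and are not transmitted, and it
          emits exactly the characters shown in this manual's examples;
          any terminator used by earlier DECtalk models is not shown
          here and is therefore not added. dt_escape is a hypothetical
          helper name.

```c
#include <stdio.h>

/* Build ESC P 0 ; p1 ; p2 ... z (without spaces) into buf.
   Returns the number of characters written, or -1 if buf is
   too small. */
int dt_escape(char *buf, size_t size, const int *params, size_t count)
{
    size_t used;
    size_t i;
    int n;

    n = snprintf(buf, size, "\033P0");
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    for (i = 0; i < count; i++) {
        n = snprintf(buf + used, size - used, ";%d", params[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, size - used, "z");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    used += (size_t)n;
    return (int)used;
}
```

          Called with the two parameters 202 and 250, the routine
          produces the comma pause example above.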

          __________________________________________________________________


          FUNCTION:                               DEFINE VOICE
          COMMAND:                                [:define voice XX DD]
                                                  [:dv XX DD]
          OPTIONS:                                N/A
          PARAMETERS:                             N/A
          ESCAPE VALUE:                           N/A
          DEFAULT:                                Parameters for Paul voice
          EXAMPLES:                               [:define voice ap 120]

          DESCRIPTION:
          This command allows the user to set speech parameters such as
          pitch range (for a greater excitement level), head size (for a
          deeper voice) and the like. XX is a mnemonic which stands for an
          acoustic parameter. DD is a decimal value for the parameter.
          See Appendix D for a list of these parameters. For example, to
          change the average pitch to 120, type [:dv ap 120]. To save
          these parameters to the variable voice val, use the [:dv save]
          command. Then, when you type [:nv] or [:name val], the saved
          parameters will be used.

          Note: You should leave a space between the parameter name and
          the value. Thus, to set the pitch range to 0, type [:dv pr 0]
          and not [:dv pr0].
          __________________________________________________________________

          FUNCTION:                               DIGITIZED SPEECH
          COMMAND:                                [:digitized]

          OPTIONS:                                None
          PARAMETERS:                             None

          ESCAPE VALUE:                           800


          DESCRIPTION:
          Used to synchronize digitized data with the text stream.

          __________________________________________________________________
          FUNCTION:                               DIAL TONES
          COMMAND:                                [:dial "X"]
          ESCAPE VALUE:                           400
          OPTIONS:                                None

          PARAMETERS:
          String of dial characters (0-9, A, B, C, D, #, *, and comma)
          bounded by quotation marks

          DEFAULT:                                None
          EXAMPLES:                               [:dial "508-555-1212"]
                                                  ESC P 0 ; 400 ; 101 z

          DESCRIPTION:
          This command generates tones called Dual Tone Multiple Frequency
          (DTMF) tones or Touch-Tones(tm). The tones are the touch-tones
          for 0-9, *, #, "," (comma) and A, B, C, D (in uppercase only)
          for handsets which contain these. The comma can be used to
          generate a 2-second pause. These tones can be used to dial a
          telephone. Quotation marks are required around the digits, and
          dashes should be used where appropriate in telephone numbers.
          The dial tone command is asynchronous.

          __________________________________________________________________


          FUNCTION:                               ENABLE
          COMMAND:                                [:enable]
          ESCAPE VALUE:                           14
          OPTIONS:                                None
          PARAMETERS:                             None
          DEFAULT:                                On

          DESCRIPTION:
          Enables speaking after a selective text flush. This command is
          needed only after a flush text or if a TSR call of flush_speech
          is used.
          __________________________________________________________________


          FUNCTION:                               ERROR

          COMMAND:                                [:error X]
          ESCAPE VALUE:                           300

          OPTIONS:
          Ignore (0):     Ignore all errors
          Text (1):       Send errors back as text strings of the form:
                          [:error <type>]
          Escape (2):     Send errors back as escape sequences of the
                          form: ESC P 0 ; 300 ; <error code>
          Speak (3):      Speak error string in current voice, rate, etc.
          Tone (4):       Generate error tone.

          PARAMETERS:                             None
          DEFAULT:                                Off


          DESCRIPTION:

          This command sets the error mode for the module. This command is
          useful for debugging in an application development setting.
          __________________________________________________________________


          FUNCTION:                               FLUSH
          COMMAND:                                [:flush]
          ESCAPE VALUE:                           10

          OPTIONS:
          All (0):        Flush all text (see flush_text above).
          Until (1):      Flush until the specified index mark is found.
          Mask (2):       Flush until mask and index bit coincide.
          After (3):      Flush all text after index mark.
          Speech (4):     Flush all speech but continue to process
                          commands. This mode is ended by the enable
                          command.

          PARAMETERS:                             Index mark value or mask
                                                  value
          DEFAULT:                                None

          EXAMPLES:                               [:flush after 193]
                                                  ESC P 0 ; 10 ; 3 ; 193 z

          DESCRIPTION:
          This command allows speech to be discarded. It stops speech,
          even if DECtalk PC is in the middle of a sentence. Any pending
          but unspoken text is lost, including index markers that may
          have been sent by the PC, and all internal buffers are
          reinitialized. The flush command is asynchronous.
          __________________________________________________________________

          FUNCTION:                               INDEX
          COMMAND:                                [:index XX DD]

          ESCAPE VALUE:
          Index Mark (20)
          Index Reply (21)
          Index Query (22)

          OPTIONS:
          Mark:   Insert mark into text at current position.
          Reply:  Insert mark and reply when encountered.
          Query:  Respond with last encountered mark.

          PARAMETERS:                             Index Mark Value
                                                  Index Reply Value
                                                  Index Query Value
          DEFAULT:                                None

          EXAMPLES:                               [:index reply 123]
                                                  ESC P 0 ; 21 ; 123 z

          DESCRIPTION:
          Text sent to DECtalk PC can contain index marks. DECtalk PC
          remembers these marks when they are spoken. The application can
          "listen" to the spoken text (by reading the value of the last
          index) to determine how much transmitted text was actually
          spoken. Index markers are truly marks; they do not modify
          heuristics or word pronunciations in any way.

          Mark inserts an index marker (flag) in the text stream sent to
          DECtalk PC; it simply marks a position in the text. Reply
          marks a position, but also has DECtalk PC inform the PC when
          the index is spoken: when DECtalk PC speaks the reply sequence,
          it sends a reply to the PC. Query requests DECtalk PC to reply
          to the PC with the last index marker spoken (that is, the last
          portion of spoken text that had an index marker). It will send
          back [:index xxx].

          Note: The Index Reply will always be the same number as the
          Index.
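
          An application that reads index replies might extract the
          index value with a routine like the C sketch below. It
          assumes the reply arrives as text in exactly the form
          [:index xxx], with xxx a decimal number; how the reply
          characters are read back from DECtalk PC is not covered.
          dt_parse_index is a hypothetical helper name.

```c
#include <stdio.h>

/* Parse a reply of the form "[:index 123]". On success, store
   the number in *value and return 1; otherwise return 0 and
   leave *value unchanged. */
int dt_parse_index(const char *reply, int *value)
{
    int v;
    char close;

    /* Both the number and the closing bracket must be present. */
    if (sscanf(reply, "[:index %d%c", &v, &close) != 2 || close != ']')
        return 0;
    *value = v;
    return 1;
}
```

          The application can compare the returned value with the
          marks it has sent to determine how much of the transmitted
          text has been spoken.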

          __________________________________________________________________


          FUNCTION:                               LOG
          COMMAND:                                [:log XX YY]

          ESCAPE VALUE:                           81
          OPTIONS:
          Set (0):        Absolute
          On (1):         Enable mode
          Off (2):        Disable mode
          Text (3):       Log all text except escape sequences
          Phonemes (4):   Log converted phonemic text
          PARAMETERS:                             None

          DEFAULT:                                Off
          EXAMPLES:                               [:log text on]

                                                  ESC P 0 ; 81 ; 1 z
          DESCRIPTION:

          This command sets text logging modes for the module. The log
          command controls DECtalk PC logging of input text. For example,
          it allows DECtalk PC to send the phonemes corresponding to the
          input text back to the PC. The PC must be prepared to receive
          the characters when they are sent, or they will be lost.
          __________________________________________________________________

          FUNCTION:                               MODES
          COMMAND:                                [:mode XX YY]

          ESCAPE VALUE:                           80
          OPTIONS:
          Set (0):        Absolute
          On (1):         Enable mode
          Off (2):        Disable mode
          Math (4):       Change interpretation of selected symbols
          Europe (8):     Select European monetary pronunciation
          Spell (16):     Spell all words
          Name (64):      Pronounce all proper names (see also
                          [:pronounce name])
          Homograph (128): Reserved for Future Use
          PARAMETERS:                             None

          DEFAULT:                                Off
          EXAMPLES:                               [:mode spell on]
                                                  ESC P 0 ; 80 ; 20 z

          DESCRIPTION:
          When mode is set to Europe, "," is the separator between the
          integer and fraction parts of a number and "." is the separator
          between 3-digit blocks. Ex: 1.255 (US) = 1,255 (Europe);
          125,873 (US) = 125.873 (Europe).  1 = set; 2 = clear.

          Math takes ambiguous characters and pronounces them with
          mathematical meanings. Specifically, the following characters
          are treated differently:


                  Character   Clear                 Set

                  +           plus                  (no change)
                  -           dash                  minus
                  *           asterisk              multiplied by
                  /           slash                 divided by
                  xxE-xx      (spelled)             (scientific notation)
                  ^           caret                 to the power of
                  <           left angle bracket    less than
                  >           right angle bracket   greater than
                  =           equals                (no change)
                  %           percent               (no change)
                  .           period                point
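
          Returning to Mode Europe: the relation between the US and
          European number conventions given in the examples (1.255 and
          1,255; 125,873 and 125.873) amounts to exchanging the two
          separator characters. The C sketch below shows that exchange
          for an application that needs to convert numbers itself
          before sending text. dt_swap_separators is a hypothetical
          helper; it says nothing about how DECtalk PC implements the
          mode internally.

```c
#include <stddef.h>

/* Exchange '.' and ',' in place, converting a number between the
   US and European conventions described for Mode Europe. */
void dt_swap_separators(char *number)
{
    size_t i;

    for (i = 0; number[i] != '\0'; i++) {
        if (number[i] == '.')
            number[i] = ',';
        else if (number[i] == ',')
            number[i] = '.';
    }
}
```

          Because the exchange is symmetric, applying the routine twice
          restores the original string.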


          Mode Name allows uppercase words which do not occur at the
          start of a sentence to be interpreted as special cases and
          pronounced as names.

          Note: Enable Mode Name only when pronouncing lists of names.
          Mode Name will interpret any uppercase word as a name. When
          finished, make sure that this mode is set to OFF. For
          occasional use of this facility, use the [:pronounce name]
          command (below).

          Mode commands are asynchronous.


          __________________________________________________________________
          FUNCTION:                               PAUSE
          COMMAND:                                [:pause DDD]
          ESCAPE VALUE:                           12
          OPTIONS:                                None
          PARAMETERS:                             Pause time in
                                                  milliseconds; 0 = forever.
          DEFAULT:                                N/A
          EXAMPLES:                               [:pause 200]
                                                  ESC P 0 ; 12 ; 200 z

          DESCRIPTION:
          This command pauses the audio output of the module. Any pending
          but unspoken text is retained, including index markers that may
          have been sent by the PC.  The user then has the option of
          flushing all text or resuming speech where it left off.
          DECtalk will not speak until such a command is given. The pause
          command is asynchronous.

          __________________________________________________________________
          FUNCTION:                               PERIOD PAUSE

          COMMAND:                                [:period  DD]
          ESCAPE VALUE:                   203

          OPTIONS:                                None
          PARAMETERS:                             Pause time in milliseconds

          DEFAULT:                                640 ms.
          EXAMPLES:                               [:period 250]

                                                  ESC P 0 ; 203 ; 250 z
          DESCRIPTION:

          Period pause may be incremented and decremented. [:pp 0] resets
          the period pause to its default state (approximately 640 ms).
          Period pauses can be incremented by up to 30000 ms and
          decremented by up to 380 ms. All values outside the legal range
          default to the nearest legal value.
          __________________________________________________________________




          FUNCTION:                       PHONEME INTERPRETATION

          COMMAND:                        [:phoneme (parameters)]
          ESCAPE VALUE:                   600

          OPTIONS:
          On (0):         Set phonemic interpretation on
          Off (1):        Set phonemic interpretation off
          Asky (2):       Reserved for future use
          Arpabet (3):    Set phonemic interpretation to the Arpabet
                          phonemic alphabet
          Speak (4):      Speak encountered phonemes
          Silent (5):     Do not speak encountered phonemes

          PARAMETERS:                             None

          DEFAULT:                                Off
          EXAMPLES:                               [:phoneme arpabet speak on]

          DESCRIPTION:
          When set, this command causes everything within square brackets
          to be interpreted as phonemic text. To phonemicize text, simply
          put legal phoneme strings in square brackets; this allows the
          preferred pronunciation of a word or phrase to be specified.
          This is an extremely important function, since it sets the
          characters "[" and "]" as phoneme delimiters: when this command
          is set, all text and characters that appear between square
          brackets are interpreted as phonemic text and pronounced as
          such. This is useful if you wish to have a particular word or
          phrase read phonemically. For example, to say the word red,
          simply embed the phonemic string [r'ehd] in the text.

          Note: It is important to make sure that you close square
          brackets after phonemic text when this command is set.
          Otherwise, if normal text appears in square brackets, speech
          will sound garbled. Also, in previous versions of DECtalk,
          square brackets were nested. This is no longer the case: one
          closing square bracket is sufficient to close phonemic mode. It
          is sometimes useful to begin a text file with a closing square
          bracket ("]") to ensure that text will not be interpreted
          phonemically. Also, the command sequence consisting of an open
          square bracket followed by a colon ("[:") is always interpreted
          as the beginning of a voice command and cannot be interpreted
          literally.

          The default for phonemic mode is OFF; it must be turned on by a
          special command.
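
          Because an unclosed bracket garbles everything that follows, an
          application may want to check its text before sending it. The
          Python sketch below is one hedged way to do so, assuming
          phonemic interpretation is on. The function name is
          hypothetical, and treating a "[:" command as ending at the next
          "]" is an assumption of the sketch.

```python
def leaves_bracket_open(text):
    """Return True if `text` ends with a "[" that was never closed."""
    state = "text"   # one of "text", "phonemes", "command"
    for index, ch in enumerate(text):
        if ch == "[" and state == "text":
            # "[:" always begins a voice command, never phonemic text.
            is_command = text[index + 1:index + 2] == ":"
            state = "command" if is_command else "phonemes"
        elif ch == "]":
            state = "text"   # brackets do not nest; one "]" closes
    return state != "text"


print(leaves_bracket_open("say [r'ehd] now"))   # bracket closed
print(leaves_bracket_open("say [r'ehd now"))    # bracket left open
```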

          __________________________________________________________________





          FUNCTION:                               PRONOUNCE NAME

          COMMAND:                                [:pronounce name]
          ESCAPE VALUE:

          OPTIONS:                                None
          PARAMETERS:                             None

          DEFAULT:                                Off
          EXAMPLES:                               [:pron name] (proper name)

          DESCRIPTION:
          This command takes the immediately following word and pronounces
          it as if it were a proper name.  First names, last names, street
          names and place names are all examples of proper names. This
          command can be used when DECtalk PC mispronounces a proper noun.
          See Mode Name to set this automatically.

          Note: This command must be used each time a new name is
          encountered but is useful when the location of a name field is
          known.


          __________________________________________________________________


          FUNCTION:                                      PUNCTUATION
          COMMANDS:                               [:punct XX]

          ESCAPE VALUE:                           204
          OPTIONS:

          None (0):       No punctuation spoken; all punctuation treated as
                          text breaks
          Some (1):       Standard DECtalk punctuation pronunciation
          All (2):        All punctuation is spoken
          PARAMETERS:                             None

          DEFAULT:                                [:punc some]
          EXAMPLES:                               [:punc none]
          ESC P 0 ; 204 ; 0 z

          DESCRIPTION:
          There are three modes of punctuation pronunciation: (1) no
          punctuation is spoken; (2) only non-clause-final punctuation is
          spoken; (3) all punctuation is spoken. The last mode may be
          useful in proofreading, as well as in applications where
          special characters are encountered, as in computer programs and
          the like. The default is [:punc some].

          Note: When the [:punc none] command is used, no punctuation will
          be pronounced. This will affect dollar amounts and similar
          sequences normally processed by special rules.


          ----------------------------------------------------------------------

          FUNCTION:                               RATE SELECTION
          COMMAND:                                [:rate dd]

          ESCAPE VALUE:                           200
          OPTIONS:                                None

          PARAMETERS:                             Rate in words per minute
          DEFAULT:                                180 wpm

          EXAMPLES:
          [:rate 400]

          ESC P 0 ; 200 ; 400 z
          DESCRIPTION:

          The speaking rate in DECtalk PC can be selected from 120 words
          per minute at the slowest to 550 words per minute at the
          fastest. In the command [:raDDD], each D is a decimal digit
          from 0 to 9. All values outside the range 120-550 default to
          the nearest legal value. Therefore, if you select a speaking
          rate of [:ra880], or 880 words per minute, it will default to
          the nearest legal value, 550 words per minute.
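
          The "nearest legal value" behavior amounts to clamping the
          requested rate to the 120-550 range. A minimal Python sketch,
          with a hypothetical function name, shows the rate an
          application can expect to be in effect after sending a given
          value:

```python
MIN_RATE, MAX_RATE = 120, 550   # legal range stated in this manual


def effective_rate(requested_wpm):
    """Rate in effect after out-of-range values snap to the nearest limit."""
    return max(MIN_RATE, min(MAX_RATE, requested_wpm))


print(effective_rate(880))   # the [:ra880] example above
```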
          ______________________________________________________________________


          FUNCTION:                               SAY MODE

          COMMAND:                                [:say X]
          ESCAPE VALUE:                            82

          OPTIONS:
          Clause (0):     Speak on end of clause
          Word (1):       Speak on end of word
          Letter (2):     Speak on end of letter
          Line (3):       Speak on end of line
          PARAMETERS:                             None

          DEFAULT:                                [:say clause]
          EXAMPLES:                               [:say word]

                                                  ESC P 0 ; 82 ; 1 z
          DESCRIPTION:

          In DECtalk PC, each clause, word, or letter can be spoken as
          entered. In word and letter mode, DECtalk PC does not need to
          wait for a clause terminator to begin speaking. Word mode acts
          like letter mode except that text is spoken a word at a time: a
          white space or equivalent after a character or string of
          characters causes that string to be spoken.

          This command interacts with the speaking rate command, so you
          can set both the speaking rate and word or letter mode for
          optimal output, increasing or decreasing the rate at which
          letters or words are spoken. For example, for rapid feedback of
          what is being typed, try letter mode with a rate of 400 wpm.

          In clause mode, speaking starts when DECtalk PC receives a
          clause terminator (period, comma, exclamation point, or
          question mark) followed by a white space. There is no time-out
          limit. This is the normal mode, in which text is spoken a
          phrase, clause, or sentence at a time, and it is the default
          mode.

          Say line CR or LF to a CTRL K

          Note: In [:say letter] mode, the "[" character will be spoken
          only after the next character is typed since DECtalk needs to
          know whether or not this is the beginning of a new command.
          __________________________________________________________________


          FUNCTION:                               RESUME

          COMMAND:                                [:resume]
          ESCAPE VALUE:                   13

          OPTIONS:                                None
          PARAMETERS:                             None

          DEFAULT:                                N/A
          EXAMPLES:                               [:resume]

                                                  ESC P 0 ; 13 z
          DESCRIPTION:

          This command allows speech to be resumed where pause left off.
          Any pending but unspoken text is retained, including index
          markers that may have been sent by the PC. This command is
          asynchronous.
          __________________________________________________________________


          FUNCTION:                               SYNCHRONIZATION

          COMMAND:                                [:sync]
          ESCAPE VALUE:                           11

          OPTIONS:                                None
          PARAMETERS:                             None

          DEFAULT:                                N/A
          EXAMPLES:                               [:sync]

                                                  ESC P 0 ; 11 z

          DESCRIPTION:
          The application program can send data to DECtalk PC faster than
          DECtalk PC can speak it. If the user must carry on a dialogue
          with the application program, the application program should
          know whether DECtalk PC has finished speaking the text sent to
          it. SYNC provides this coordination between the application
          program and DECtalk PC speech. When the PC sends SYNC, DECtalk
          PC finishes speaking any pending text before processing the
          next command from the host computer. Therefore, the user hears
          a message before any other action starts. Note that SYNC acts
          as a clause boundary, the same as a comma, period, exclamation
          point, or question mark. SYNC does not reply to the host
          computer when processing is complete. However, you can arrange
          to get a reply by following the DT_SYNC command with a
          DT_INDEX_QUERY command.
          __________________________________________________________________


          FUNCTION:                               TIMEOUT

          COMMAND:                                [:timeout D]
          OPTIONS:                                None

          PARAMETERS:                             Timeout in seconds
          DEFAULT:

          EXAMPLES:                               [:timeout 4]
                                                  ESC P 0 ; 402 ; 4 z

          DESCRIPTION:
          This command sets the buffer flush timeout value. See Flush.

          ______________________________________________________________________


          FUNCTION:                               TONE
          COMMAND:                                [:tone DD, DD]

          ESCAPE VALUE:                           401
          OPTIONS:                                None

          PARAMETERS:
          Frequency:      Tone Frequency in Hertz

          Duration:               Tone Duration in milliseconds
          EXAMPLES:                               [:tone 500,500]

                                                  ESC P 0 ; 401 ; 500 ; 500 z
          DESCRIPTION:

          This command generates sounds of different frequencies and
          lengths, depending on the parameters you set. It allows you to
          make a wide variety of sounds for purposes such as
          notification, warning, and so on. Regular tones can also be
          used for a number of other purposes in screen-reading
          applications, such as indicating the margin bell. This may be
          useful when someone wishes to work in a quiet environment
          without using the speaker integral to the PC.

          __________________________________________________________________


          FUNCTION:                               VOICE SELECTION
          COMMAND:                                [:name X]

                                                  [:nX]
          ESCAPE VALUE:                   200

          OPTIONS:
          PAUL (0), BETTY (1), HARRY (2), FRANK (3), DENNIS (4), KIT (5),
          URSULA (6), RITA (7), WENDY (8), VAL (9).

          PARAMETERS:                             None
          DEFAULT:                                Paul voice

          EXAMPLES:
          [:name paul]

          [:np]
          ESC P 0 ; 200 ; 0 z

          DESCRIPTION:
          [:name X] is a command that changes the voice to one of nine
          hard-coded voices or to a special definable voice. X represents
          the mnemonic for one of the voices: P = Paul, H = Harry, F =
          Frank, D = Dennis, B = Betty, U = Ursula, R = Rita, W = Wendy,
          K = Kit, and V = Val. The default voice is Paul. The parameters
          of any one voice may be changed by the define voice [:dv]
          command.

          __________________________________________________________________


          FUNCTION:                               VOLUME
          COMMANDS:                               [:volume XX DD]

          ESCAPE VALUE:
          100: Volume Set

          101: Volume Up
          102:  Volume Down

          OPTIONS:
          Set:    Set the volume to the desired level
          Up:     Increase the volume by the desired amount
          Down:   Decrease the volume by the desired amount

          PARAMETERS:                             Volume or delta volume
          DEFAULT:                                5

          EXAMPLES:                               [:volume up 3]
                                                  ESC P 0 ; 101 ; 3 z

          DESCRIPTION:
          Volume in DECtalk PC can be controlled by a manual volume
          control on the speaker or by a software-controlled volume
          setting on the PC board. The volume commands control the
          setting on the PC board, which sets the speaker volume in
          increments from 0 to 99. Increments or decrements of 10 to 20
          will provide a perceptible increase or decrease in volume. The
          default volume level is 5. [:volume set] is an absolute
          command; [:volume up] and [:volume down] are relative commands
          that increment or decrement the current value. The volume
          command is asynchronous.

          Note: It is not recommended that normal operation be above a
          setting of 95. This would be excessively loud, and an overload
          condition could result. No damage will occur, but the unit
          could go into overload protection mode. The audio output
          contains self-protection circuitry that guards against shorts
          or overloads. This will result in reduction or cessation of
          output, and the unit may take several seconds to recover after
          the short or excessive overload is encountered. This operation
          is normal.

          Note: The command [:lo] used in earlier versions of DECtalk is
          no longer used and has been replaced by the G5 parameter used
          with [:define voice]. To adjust the volume on DECtalk PC other
          than by using the volume control knob on the speaker, it is
          recommended that you use the [:volume] command.


          DECtalk PC has a great deal of added functionality. Because of
          this, different commands may occasionally interact to produce
          unexpected results. For example, setting the
          [:punctuation none] command will disable all non-alphanumeric
          characters, and therefore sequences such as $3.25 will not be
          pronounced correctly. It is therefore necessary to remember
          which commands have been enabled and to make modifications
          accordingly.













                                              CHAPTER 4


                                   TEXT PROCESSING




          This chapter describes how DECtalk PC processes numbers,
          abbreviations, and acronyms, and how it decides whether a word
          is pronounceable. It also includes suggestions for correcting
          spoken output problems.

                                   TEXT PROCESSING RULES


          DECtalk PC processes text to be spoken by applying the following
          rules in this order.

                  1.      The input text stream is broken into groups of
                          letters delimited by whitespace characters
                          (spaces, tabs, or carriage returns).

                  2.      If the letter string is not already phonemic
                          text and is to be converted, any understandable
                          numbers are first expanded to their word
                          equivalents.

                  3.      Some abbreviations are expanded to their
                          full-word equivalents. DECtalk PC uses a list of
                          numeric abbreviations and rules for a few
                          special cases. The developer-definable
                          dictionary cannot override this conversion.

                  4.      Each letter string is broken into pronounceable
                          entities. Punctuation (including parentheses and
                          quotation marks), hyphenated words, and
                          sequences that must be spelled out are analyzed.
                          Some abbreviations and acronyms are recognized,
                          plus any entries from the definable dictionary.

                  5.      Any text that DECtalk PC recognizes as
                          unpronounceable (for example, a sequence of
                          letters containing no vowels) is spelled out.
                          DECtalk PC contains more intelligent strategies
                          for upper-case initialisms (e.g., ABC, FBI, and
                          the like).
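
          As a rough illustration of rules 1 and 5 only, the Python
          sketch below splits input on whitespace and flags groups that
          contain letters but no vowels. The function names are
          hypothetical, counting "y" as a vowel is an assumption, and the
          real test for pronounceability is certainly more elaborate than
          this.

```python
VOWELS = set("aeiouy")   # counting "y" as a vowel is an assumption


def split_groups(text):
    """Rule 1: break the input into whitespace-delimited groups."""
    return text.split()


def must_spell(group):
    """Rule 5 example: a letter sequence with no vowels is spelled out."""
    letters = [c for c in group.lower() if c.isalpha()]
    return bool(letters) and not any(c in VOWELS for c in letters)


for group in split_groups("send\tthe pkg\rnow"):
    print(group, "spell" if must_spell(group) else "speak")
```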

          A few rules operate on sequences of words. Interspersing
          phonemic symbols or DECtalk PC commands will block these rules.
          Therefore, make sure that spoken text is as contiguous as
          possible and keep breaks in structure (from English spelling to
          phonemic transcription) to a minimum.

          The following terms are used below:

          Character:  Any of the printable ASCII characters, including
          letters, digits, and punctuation.

          Digit string:  A string of digit characters (0 through 9).
          DECtalk PC decides whether these should be pronounced as
          numbers or independent characters.

          Number:  A string of characters (containing digits) that is
          processed as a group by DECtalk PC. For example, "123" is
          pronounced "one hundred and twenty-three," while "1(2)3" is
          pronounced "one, two three" with punctuation disabled
          ([:punc none]) or "one left-parenthesis two right-parenthesis
          three" with punctuation enabled ([:punc some]).

                                     NUMBER PROCESSING


          DECtalk PC recognizes seven general number classes, and a large
          number of special cases and subclasses. The general classes are
          as follows.

          Part numbers      Strings of mixed letters, digits, and the -
                            and / characters.

          Cardinal numbers  The simple numbers that are used in counting.
                            Examples include "123," "123,456," "12.345,"
                            "01234," and "12%."

          Ordinal numbers   Simple strings of numbers with "st," "nd,"
                            "rd," or "th" added, for example, "1st" and
                            "23rd."

          Fractions         Examples are "1/2," "2/3," and "44/100%."

          Money             Recognized by the presence of a dollar sign
                            ($) or a pound sign (£) as the first
                            character of a word.

          Dates             In the format (23-Sep-1983), and expandable
                            into their English equivalent.

          Time of day       In the 24-hour format of some operating
                            systems (11:04:03.02), and spoken in its
                            English equivalent. The words "AM" and "PM"
                            are correctly processed after time values.
                            (Note that AM and PM contain no periods.)

                  Part Numbers
          A part number is defined as a string of mixed letters, digits,
          and the - and / characters, containing at least one digit. The
          following are examples of part numbers.
                  DTC07-AM
                  MS-DOS V3.1
                  54-15966-01

          DECtalk PC first attempts to find the part number in the
          developer-definable and fixed dictionaries. If it is
          unsuccessful, it breaks the part number into strings of
          letters, strings of digits, and separators.

          A series of alphanumerics separated by / is spelled out.
          DECtalk PC correctly speaks part numbers of the format XXX/YYY.
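
          The definition and the breaking step can be sketched in Python
          with two regular expressions. This is an illustration under
          stated assumptions, not the DECtalk PC algorithm: the function
          names are hypothetical, the dictionary lookup is omitted, and
          excluding plain digit strings (which would be cardinal numbers)
          is an assumption.

```python
import re

ALLOWED = re.compile(r"[A-Za-z0-9/-]+")       # letters, digits, - and /
PIECES = re.compile(r"[A-Za-z]+|[0-9]+|[/-]")  # letter run, digit run, separator


def is_part_number(word):
    """Allowed characters only, with at least one digit."""
    if not ALLOWED.fullmatch(word):
        return False
    has_digit = any(c.isdigit() for c in word)
    # Assumption: a word made only of digits is a cardinal number instead.
    return has_digit and not word.isdigit()


def split_part_number(word):
    """Break a part number into letter strings, digit strings, separators."""
    return PIECES.findall(word)


print(split_part_number("DTC07-AM"))
print(is_part_number("CICS/VS"))   # no digits, so not a part number
```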

          A string of digits within a part number is spoken as follows.

                  1. If the digit string begins with 0 or is more than
                     nine digits long, it is spelled out ("VS01" becomes
                     "vee ess zero one").

                  2. One- or two-digit strings are spoken as normal
                     cardinal numbers ("PDP-11" becomes "pee dee pee
                     eleven").

                  3. Three- or four-digit strings that end with 00 are
                     spoken as normal cardinal numbers ("VT100" becomes
                     "vee tee one hundred").

                  4. Other three-digit strings are spoken as "digit, pair
                     of digits" ("VT320" becomes "vee tee three twenty").

                  5. Other four-digit strings are spoken as "pair, pair"
                     ("DEC 2040" becomes "deck twenty forty"). Note that
                     if the second pair begins with 0, it is pronounced
                     "zero" ("IBM 1401" becomes "eye bee em fourteen zero
                     one").
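
          The five digit-string rules above can be sketched as a small
          Python function. All names are hypothetical. Two cases are not
          specified by the rules and are marked as assumptions in the
          code: the reading of four-digit strings such as "1200", and
          strings of five to nine digits.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def cardinal_under_100(n):
    """Cardinal reading of 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def pair(two_digits):
    """Read a digit pair; a leading 0 is spoken as "zero"."""
    if two_digits[0] == "0":
        return f"zero {ONES[int(two_digits[1])]}"
    return cardinal_under_100(int(two_digits))


def spell(digits):
    """Read a digit string one digit at a time."""
    return " ".join(ONES[int(d)] for d in digits)


def speak_part_digits(digits):
    """Speak one non-empty digit string taken from a part number."""
    if digits[0] == "0" or len(digits) > 9:          # rule 1
        return spell(digits)
    if len(digits) <= 2:                             # rule 2
        return cardinal_under_100(int(digits))
    if len(digits) == 3 and digits.endswith("00"):   # rule 3
        return f"{ONES[int(digits[0])]} hundred"
    if len(digits) == 4 and digits.endswith("00"):   # rule 3
        thousands, hundreds = int(digits[0]), int(digits[1])
        words = f"{ONES[thousands]} thousand"
        # Assumption: the manual does not show how e.g. "1200" is read.
        return words if hundreds == 0 else f"{words} {ONES[hundreds]} hundred"
    if len(digits) == 3:                             # rule 4
        return f"{ONES[int(digits[0])]} {pair(digits[1:])}"
    if len(digits) == 4:                             # rule 5
        return f"{cardinal_under_100(int(digits[:2]))} {pair(digits[2:])}"
    return spell(digits)   # assumption: 5 to 9 digits are not covered


for example in ["01", "11", "100", "320", "2040", "1401"]:
    print(example, "->", speak_part_digits(example))
```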

          An alphabetic string is spoken as follows.

                  1. One- or two-character strings are spelled out
                     ("VT100" becomes "vee tee one hundred").

                  2. Longer strings are searched for in the
                     developer-definable and fixed dictionaries. If they
                     are not found, they are spelled out ("DEC 2040"
                     becomes "deck twenty forty").

          DECtalk PC cannot handle all possible part numbers perfectly.
          The following examples of part numbers are inconsistent with
          DECtalk PC's number and text processing algorithms.

                  CICS/VS    Not a part number -- no digits.

                  net10000   "Net" is spelled out since it isn't in the
                             dictionary.

                  1E-14      DECtalk PC will interpret this number as
                             scientific notation if [:mode math] is on.

          When processing numbers and number words, DECtalk PC first
          removes leading and trailing punctuation. DECtalk PC translates
          "(123)" as "one hundred and twenty-three."

                  Cardinal Numbers
          A cardinal number is a string of digits. If commas are included,
          they must break numbers into groups of three. For example,
          "123,456" is correct, but "1234,56" is not. The latter will be
          spelled out as "one two three four comma five six."
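
          The comma rule is easy to state as a pattern: one to three
          leading digits, then one or more groups of a comma and exactly
          three digits. The Python check below is a sketch with a
          hypothetical function name; it covers whole numbers only.

```python
import re

GROUPED = re.compile(r"[0-9]{1,3}(,[0-9]{3})+")


def commas_well_placed(number):
    """True if every comma is followed by a group of exactly three digits."""
    return GROUPED.fullmatch(number) is not None


print(commas_well_placed("123,456"))
print(commas_well_placed("1234,56"))
```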

          Cardinal numbers may also include decimal fractions ("12.34")
          and scientific notation ("12.34E56"). In scientific notation,
          the exponent must be less than 100.

          A cardinal number preceded by + or - will be spoken as "plus" or
          "minus" whether or not [:mode math ...] is on. The notation ±
          is pronounced as "plus or minus."




          If the first digit is 0 ("01234"), the number will be spoken as
          a string of digits, as would be appropriate when reading postal
          zip codes.

          If the number is greater than 999,999,999, it will be spoken as
          a string of digits with pauses between each group of three
          digits. If commas are provided, they will control the pause
          behavior. If not, the output will pause after each group of
          three digits, provided six or more digits remain. Therefore,
          "12345678901" will be spoken as "123, 456, 78901" rather than
          "12,345,678,901."

          Four-digit numbers without commas are spoken in a variety of
          formats. For example, "5000" becomes "five thousand," while
          "1984" becomes "nineteen eighty-four." This yields reasonable
          behavior when processing years.
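
          The pause grouping for numbers greater than 999,999,999 that
          contain no commas can be sketched in Python as follows. The
          function name is hypothetical, and the reading of "provided six
          or more digits remain" is inferred from the single
          "12345678901" example above.

```python
def pause_groups(digits):
    """Split an uncommaed digit string into the groups separated by pauses.

    A group of three is split off while six or more digits remain,
    which reproduces the manual's "12345678901" example.
    """
    groups = []
    while len(digits) >= 6:
        groups.append(digits[:3])
        digits = digits[3:]
    if digits:
        groups.append(digits)
    return groups


print(", ".join(pause_groups("12345678901")))
```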

          Sometimes DECtalk PC does not understand the text well enough to
          pronounce the number correctly. Here are some examples.

                  o  The telephone number "(617) 493-8255" will be spoken
                     as "six hundred and seventeen, four ninety three dash
                     eight two five five." You can correct this by using
                     one of the following steps.

                     1. Spell out the digits as "six one seven, four nine
                        three, eight two five five" (notice the commas to
                        make DECtalk PC pause at appropriate places).

                     2. Separate the digits with spaces and commas:
                        "6 1 7, 4 9 3, 8 2 5 5."

                  o  The software cannot easily distinguish between "dash"
                     and "minus."

                          How much is 10-15?

                          Bake this 10-15 minutes.

                     The [:mode math ...] option determines whether the
                     "-" is pronounced as "dash" or "minus." Neither will
                     be pronounced if punctuation is disabled
                     ([:punc none]).

          Some number formats are difficult to recognize out of context.
          For example, the International Standard Date format (83.09.20)
          and the United States telephone number format (noted
          previously) are sometimes used by manufacturers for part
          numbers. These ambiguous formats are not recognized by DECtalk
          PC, and you must make manual adjustments if you wish them to be
          pronounced in some special way.

          After a cardinal number, DECtalk PC recognizes a set of standard
          numeric abbreviations that are expanded to their English
          equivalent. These abbreviations are hardwired into DECtalk PC
          and cannot be modified by the applications programmer. DECtalk
          PC correctly generates singular and plural forms of these
          abbreviations. For example, 1 mm. is pronounced as "one
          millimeter," whereas 2 mm. is pronounced as "two millimeters."

          The numeric abbreviations recognized by DECtalk PC after a
          cardinal number are listed below. You can write them in either
          uppercase or lowercase letters, but you must follow them with a
          period.

          Other abbreviations, such as "cc.," are spelled out by DECtalk
          PC. The period that follows such an abbreviation is not
          pronounced ("cc." becomes "see see") but terminates the clause,
          while the period in number abbreviations does not terminate the
          clause.

                  Ordinal Numbers
          Ordinal numbers are formed from a string of digits (that may
          contain appropriate commas) followed by "st," "nd," "rd," or
          "th." Ordinal numbers are also generated by DECtalk PC when
          fractions and dates (in standard Digital format) are processed.

          DECtalk PC requires that the word portion of the ordinal number
          be correct. For example, "1st" will be processed correctly, but
          "2th" will be pronounced "tuw tiy eych."

                  Fractions
          Fractions consist of one or two digits in the numerator, the /
          character, and one to three digits in the denominator. The
          numerator may range from 1 to 99, while the denominator may
          range from 1 to 100. DECtalk PC correctly generates singular
          (1/3) and plural (2/3) forms.

                  Money
          DECtalk PC assumes a digit string is money when it is introduced
          by the currency symbols $ or £. When the $ or £ is recognized,
          DECtalk PC allows two forms of number strings.

                  General digit strings have optional decimal fractions.
                  "$12.345" is pronounced "twelve point three four five
                  dollars."

                  Digit strings are in dollars and cents (or pounds and
                  pence) format. "$12.34" is pronounced "twelve dollars
                  and thirty-four cents."

          DECtalk PC recognizes a number of quantity words (hundred,
          thousand, million) that modify number processing if they
          immediately follow the money word. For example, "$1.23 million"
          is pronounced "one point two three million dollars."

                  Dates
          DECtalk PC recognizes dates written in Digital's standard date
          format, such as "23-Sep-1983," "23-Sep," or "23-Sep-83."
          However, it will pronounce "Sep. 23, 1983" as "September
          twenty-three, nineteen eighty-three," reading the day as a
          cardinal number rather than an ordinal.

                  Time of Day
          DECtalk PC recognizes the time of day when written in the format
          used by Digital operating systems. Because this format can
          easily be confused with part number formats, DECtalk PC does
          not try to convert the digit string. Instead, it speaks the
          string with appropriate punctuation. Therefore, "12:00" becomes
          "twelve, zero zero," rather than "twelve noon."

          DECtalk PC correctly processes time values, including the
          fractional second value when it is present.
                  ABBREVIATIONS
          DECtalk PC recognizes and expands a set of abbreviations,
          taking into account that some abbreviations are ambiguous.

                  Abbreviations Processed by DECtalk PC
          In addition to the abbreviations that are recognized only after
          cardinal numbers, DECtalk PC recognizes two special cases, "Dr."
          and  "St." The pronunciation of these abbreviations depends on
          whether  the next word is capitalized.
                  If the next word is not capitalized or if there is no
          next word (the clause has ended), then "Dr." is  pronounced
          "drayv" and "St." is pronounced "striyt." The  next word must be
          on the same input line for the rule to  work correctly.

                  If the next word is capitalized, then "Dr." is
          pronounced "d'aaktrr" and "St." is pronounced "seynt."
          Following these rules, DECtalk PC correctly pronounces "Doctor
          Dobbs  Drive" and "Saint Louis Street" in running text.
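
          The capitalization rule for "Dr." and "St." can be sketched
          in a few lines of Python. This is an illustration of the rule
          as stated, not the DECtalk PC implementation; the function
          and table names are hypothetical, and the input is assumed to
          be a single line of text.

```python
# Expansion chosen when the next word is capitalized, and otherwise.
EXPANSIONS = {
    "Dr.": ("Doctor", "Drive"),
    "St.": ("Saint", "Street"),
}

def expand_dr_st(line):
    """Expand "Dr." and "St." based on the word that follows them."""
    words = line.split()
    result = []
    for index, word in enumerate(words):
        if word in EXPANSIONS:
            before_capital, otherwise = EXPANSIONS[word]
            # An empty string stands in for "no next word on this line".
            next_word = words[index + 1] if index + 1 < len(words) else ""
            capitalized = next_word[:1].isupper()
            result.append(before_capital if capitalized else otherwise)
        else:
            result.append(word)
    return " ".join(result)

if __name__ == "__main__":
    print(expand_dr_st("Dr. Dobbs Dr."))
    print(expand_dr_st("St. Louis St."))
```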

                  Abbreviations in the Built-In Dictionary


                  The following are some of the common abbreviations
          recognized by DECtalk PC (in alphabetical order). In dictionary
          lookups, uppercase letters in an entry match uppercase only, but
          lowercase letters match either uppercase or lowercase.


          Adm. Apr. Assoc. Aug. Av. Ave. Blvd. Bros. Ch. Cntr. Co. Comdr.
          Corp. Ctr. Dec. Dept. Dist. Feb. Flt. Fr. Fri. Ft. Gen. Gov.
          Inc. Intl. Jan. Jr. Jul. Jun. Ltd. Mar. Mfg. Mon. Mt. Nov. Oct.
          Pl. Pres. Prof. Rd. Rep. Rev. Rte. Sat. Sen. Sep. Sept. Sr. Sun.
          Thu. Thurs. Tue. Tues. Univ. Vol. Wed. asst. atty. bldg. cm.
          cms. cont. cu. deg. doz. e.g. esp. est. etc. ext. fig. fn. ft.
          gm. hrs. i.e. kg. kgs. km. lb. lbs. mg. mgs. misc. ml. mm. mr.
          mrs. ms. msde. msec. msecs. mss. nt.wt. op.cit. oz. ozs. p.p.d.
          pp. ppd. recd. secy. sq. tbsp. tbsps. tsp. tsps. vs. yds.


          If the abbreviation can be recognized by DECtalk PC during
          number  processing, then the English text form of the
          abbreviation is  spoken. Otherwise, the built-in dictionary form
          is spoken.  Dictionaries (developer-definable and fixed) are
          searched in the order in which they are loaded.  The numeric
          abbreviations can be blocked by including a dummy phonemic
          string, for example, "1 [ ]ft. 3."
          Uppercase letters in a dictionary entry match only uppercase
          letters in the same positions of a text word. Lowercase letters
          in an entry may match either lowercase or uppercase letters in
          the same positions.

          For example, "Apr." matches "APR." but not "apr." This is
          necessary to distinguish between words at the end of a sentence
          and valid abbreviations, such as "mar" (to damage) and "Mar."
          (for March).
          If a word in the above list is written with a terminating
          period, you must include that period in the input text. The
          period is treated as part of the abbreviation and does not
          terminate the current clause. For example, "It weighed 3 kgs."
          does not terminate the clause, but "It weighed 3 kgs.." (note
          the second period) does.
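
          The case-matching rule for dictionary entries can be expressed
          as a position-by-position comparison. The Python sketch below
          is a reading of the rule as described here, with a
          hypothetical function name; it is not taken from DECtalk PC.

```python
def entry_matches(entry, word):
    """Return True if a dictionary entry matches a text word.

    Uppercase letters in the entry match only the same uppercase
    letter; lowercase letters match either case. Other characters,
    such as the period, must match exactly.
    """
    if len(entry) != len(word):
        return False
    for entry_char, word_char in zip(entry, word):
        if entry_char.islower():
            if entry_char != word_char.lower():
                return False
        elif entry_char != word_char:
            return False
    return True

if __name__ == "__main__":
    print(entry_matches("Apr.", "APR."))
    print(entry_matches("Apr.", "apr."))
```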


                  WORD SPELLOUT STRATEGIES

          After number processing, DECtalk PC must decide whether to
          pronounce  a string of characters as a single word or a compound
          word, or if  it must be spelled out. DECtalk PC uses the fixed
          and developer-definable dictionaries and a series of word
          transformations to make this  decision.

                  Word Spellout Tests
          Number conversion, number abbreviations, and the "Street/Saint"
          test have all been performed before DECtalk PC begins the
          decision  tests. Punctuation has not yet been removed.

                  1.      DECtalk PC looks for the word in the
          dictionaries. Again, dictionaries are searched in the order in
          which they were loaded. (If the word is found in any one of
          the dictionaries, the search stops.)
                          The dictionary lookup procedure involves
          decomposing words into simpler forms by stripping affixes such
          as -ed and -ing. If the word is found in  the dictionary,
          DECtalk PC speaks the associated phonemic  transcription.

                  2.      If the word is not found, any punctuation around
          the word is removed. If present, the punctuation symbols
          " ( {{ <<  < [ are removed from the front of the word, and the
          punctuation symbols " ) }} >> > ] are removed from the end of
          the word. The square brackets [ ] are already discarded if the
          command [:phoneme arpabet speak on] has been given.
                  3.      If some punctuation was removed, DECtalk PC
          performs a special test for abbreviations (e.g., "Gen.," "Gov.")
          and embedded sentence punctuation ("I went (last year?) to
          school").

                  4.      Next, DECtalk PC looks for initialisms. (An
          initialism is a
          word written as a string of uppercase letters that may or  may
          not be separated by ".") For example, the string  "APO" is
          pronounced as "ey pee oh." Other strings with embedded periods
          may be spelled out.  If an initialism is recognized, the last
          "." will terminate the clause, unless it is followed by some
          other  punctuation.
                  DECtalk also looks for relatively short sequences
          (typically 3-4 letters) of all uppercase characters without
          periods and treats these as initialisms. If these are determined
          (by a special set of internal rules) to be non-pronounceable
          (e.g., ABC, FBI, IBM, etc.) then DECtalk will spell the string.
          This is a new feature of DECtalk.

                  5.      At this point, all diacritical marks are
          removed.
                  6.      If the word is still not found, it is examined
          for hyphenation (as in compound nouns) and the single-quote
          character. A test is also performed to make sure any word or
          word fragment has enough consonants and vowels. If the test
          fails, the word is spelled out.

                          This test makes sure that the word does not
          contain embedded punctuation. A word like "sys$system" is
          spelled out except when the command [:punc none] is given.

                  7.      If DECtalk PC decides the word is pronounceable,
          it
          processes each part of a compound noun independently. If  the
          word is not in the dictionary, it is processed by the
          letter-to-sound rules.
                  8.      If the word was pronounced, DECtalk PC examines
          the
          punctuation after the word for silence or clause  terminators.
          The punctuation marks " ) ] }} produce a  brief silence (only
          one silence is produced, even if  several characters are
          processed). The punctuation marks  ; : ! , . ? terminate a
          clause.

                  9.      If DECtalk PC decides that the word must be
          spelled out, the
          entire word is spelled, including left and right  punctuation.
          If the last letter of the word is a clause  terminator, it is
          considered punctuation and is not  spelled.
                  10.     A single letter, digit, or other character
          within quotes or parentheses is spelled out (but the punctuation
          is not spoken). For example, "A" is pronounced [ey] rather than
          "uh." This helps DECtalk PC process lists such as the following.

                          (a) books
                          (b) newspapers
                  11.     Brackets, parentheses, and braces act as commas,
          producing a clause boundary. Therefore, parenthetical
          expressions (such as this one) sound more natural.

                  12.     When text is spelled out, a brief pause is added
          after
          each character. This makes it easier to transcribe text,  such
          as part numbers.
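
          The punctuation-stripping test (step 2) can be pictured with
          the following Python sketch. It is an illustration only: the
          function name is hypothetical, and the leading and trailing
          symbol sets are a simplified single-character reading of the
          symbols listed in step 2.

```python
# Assumed single-character versions of the symbols listed in step 2.
LEADING = '"({<['
TRAILING = '")}>]'

def strip_word_punctuation(token):
    """Return (bare_word, removed) for one whitespace-delimited token.

    'removed' is True when any surrounding punctuation was stripped,
    which is the condition that triggers the special test in step 3.
    """
    start, end = 0, len(token)
    while start < end and token[start] in LEADING:
        start += 1
    while end > start and token[end - 1] in TRAILING:
        end -= 1
    removed = start > 0 or end < len(token)
    return token[start:end], removed

if __name__ == "__main__":
    print(strip_word_punctuation('("hello")'))
    print(strip_word_punctuation("sys$system"))
```

          Embedded punctuation, as in "sys$system," is left in place by
          this step; the later pronounceability test is what causes such
          a word to be spelled out.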

                                     CHAPTER  5


                                     PHONEMICS AND VOICES




          DECtalk PC PHONEMIC INPUT

          This chapter describes the phonemic (sound) system of English
          used  in DECtalk PC and  the ways to control DECtalk PC's
          pronunciation.


          DECtalk PC represents the state of the art in text-to-speech
          synthesis. The software shipped with DTC07-AM is the latest in
          DECtalk speech microcode. It contains a number of significant
          improvements over its predecessors. It makes fewer
          pronunciation errors and handles text in a more
          sophisticated way. It should also sound more natural and
          intelligible. Naturalness in synthesized speech is a continuum
          that is evolving slowly, both because of the inherent complexity
          of accurately replicating human speech and because of the
          difficulty of adequately defining what naturalness itself means.
          However, the developer will find that phonemic transcription
          becomes less necessary with this added sophistication.


          PHONEMIC TRANSCRIPTION
          Most users do not need to know anything about DECtalk PC
          phonemic input and may never need to use the phonemic alphabet.
          Improvements made in text-to-speech technology in the past few
          years mean that the pronunciation of an ordinary word rarely
          needs to be corrected.  On the other hand,
          many developers will want to enter unusual words in the
          definable dictionary or, for various reasons, modify the sound
          of the synthesized speech, perhaps to attain a higher degree of
          naturalness, to demonstrate emotion,  or to  emphasize a particular
          word or phrase. In these cases, it may be helpful to understand
          in a bit more detail how DECtalk PC works.

          To understand how the DECtalk PC system works and to  make
          DECtalk PC correctly pronounce any English word, you may wish to
          know something about speech sounds and how to represent them on
          a  keyboard. Because spelling in English does not always show
          exactly  how words are pronounced, dictionaries use symbols to
          show how  words really sound. Sometimes these symbols are the
          same as  letters used in spelling. A word written the way it is
          pronounced is said to be in  phonemic transcription or simply in
          phonemics.

          PRONUNCIATION ERRORS

          When DECtalk PC says a word or phrase incorrectly, you may need
          to  use phonemic input to get the desired pronunciation. The
          following  list suggests the most common types of errors that
          DECtalk PC makes,  and the best corrective action.
          Note: Prior to using phonemic transcription or clever
          misspellings, ascertain that DECtalk does indeed mispronounce
          the word. In the vast majority of cases, the word will be
          pronounced correctly. To use phonemic input, you must first turn
          phonemic mode on with the [:phoneme arpabet speak on] command.

                  DECtalk PC mispronounces a proper name.
                          Lee Iacocca

                          Corrective action: Convert to phonemic form.
                          Lee [ayaxk'owkax]

                          Or misspell in a clever way.
                          Lee Eye a Coke a.

                  DECtalk PC mispronounces an acronym.
                          The UN building

                          Corrective action: Respell with spaces between
          the letters.
                          U N

                          Or use phonemics.
                          ['yuw 'ehn]

                  DECtalk PC mispronounces an unfamiliar word.
                          articulatory

                          Corrective action: Convert to phonemic form.
                          [aart'ihkyaxlaxtowriy]

                  DECtalk PC mishandles a letter string containing
          nonalphabetic characters.
                          autoexec.bat
                          readme.txt

                          Corrective action: Respell with inserted
          spaces.
                          auto exec dot bat
                          read me dot text

                          Or convert to phonemic form.
                          ['aotowixgz`ekt*dat*b'aet]
                          [r'iyd*miy*dat*t'ehkst]

                  DECtalk PC guesses incorrectly for an ambiguously
          pronounced word.
                          The insert
                          Get the lead out.

                          Corrective action: Convert to phonemic form.
                          The ['ihnsrrt]
                          Get the [l'ehd] out.

                  DECtalk PC uses the wrong syntactic classification of a
          preposition or particle.
                          He takes on tough jobs. ("He does tough jobs"
          versus "He accepts graft when on tough jobs.")

                          Corrective action: Add a stress phoneme when
          needed.
                          He takes [']on tough jobs.

                          Or convert to phonemic form.
                          He takes ['aan] tough jobs.

                  DECtalk PC uses the wrong phrasing.
                          Following a long gasp shouts were heard.

                          Corrective action: Add commas or a verb-phrase
          introducer phoneme where needed.
                          Following a long gasp, shouts were heard.



                  INTRODUCTION TO PHONEMIC THEORY

           At one time long ago, English was pronounced as it was spelled,
          with each letter (or pair of letters) representing one sound.
          Because of historical sound changes, such as the dropping of
          sounds like the "gh" of "bought" or the "k" of "knight," and
          word borrowing from other languages, English pronunciation rules
          have become complex and include many exceptions. For example,
          "of" is pronounced with a "v" sound, while all other English
          words spelled with "f" are pronounced with an "f" sound. The
          vowel sequence "ea" can be pronounced in at least a half-dozen
          ways, as illustrated by the sounds in the words "cheap," "head,"
          "earth," and "idea." The letters "th" can be pronounced with a
          voiceless phoneme as in "thin," or with a voiced phoneme as in
          "this"; or the "th" can represent the "t" phoneme followed by
          the "h" phoneme in compound words such as "pothole."

          Some words have two pronunciations, for example, "read." Correct
          pronunciation of a sentence such as "Will you read the book or
          have you read it already?" requires an understanding of the
          meaning of the sentence, a task that DECtalk is learning to
          do. DECtalk can often correctly predict which of the alternate
          pronunciations is correct in a given context. However, because
          of the nature of language, it occasionally makes a mistake. If
          this occurs, you can get the alternate pronunciation in two
          ways.


                  By misspelling the word, e.g., "red" for "read"

                   By phonemic spelling: [r'ehd]


                  Ex. Will you read the book or have you [r'ehd] it
          already?


          Stress is an important part of phonemic representation. Stress
          alone distinguishes the two different pronunciations of words
          like  "insert."

          English words usually have one syllable that is spoken with more
          stress than the other syllables in the word. You can indicate
          this  primary stress to DECtalk PC by placing the phonemic
          symbol [']  before the vowel. The ['] symbol is described below.
           For example, the word "insert" can be spoken as a noun


                  "insert" = ['ihnsrrt]


           and as a verb
                   "insert" =  [ixns'rrt].



          Considering the complexity of English pronunciation rules and
          the  number of exceptions, it is not surprising that DECtalk
          occasionally  makes such pronunciation errors. You can adjust
          DECtalk pronunciation  through a large number of symbols,
          described in the rest of this  chapter.  Again, DECtalk V4.0
          has improved pronunciation rules and, as a result, such phonemic
          intervention will only occasionally be needed.


          PHONEMES
           A phoneme is the smallest unit of speech that distinguishes one
          word from another. Of all the sounds that human beings can
          produce, relatively few are significant in any one language.
          Only  about 40 different functional sound types or phonemes are
          used in General  American English.

           The phonemes of English are not pronounced the same by every
          speaker. We all know people who pronounce some words differently
          from the way we do, yet we understand them. The differences may
          occur because we come from different parts of the country.
          Because  of these variations, there is no such thing as a universal
          standard pronunciation of American English. DECtalk PC attempts
          to mimic a  Midwestern (Northern Milwaukee) dialect.
           Because DECtalk PC pronounces a phoneme in a standard
          rule-governed  way, it is not possible to imitate all other
          English dialects  (although you can approximate some dialectal
          differences by  phonemic spelling).

           The following sections describe the vowel and consonant
          phonemes,  stress and syntactic symbols, and optional direct
          control of  intonation or singing.


                   VOWEL AND CONSONANT PHONEMES

           Linguists have identified about 17 vowel phonemes and 24
          consonant phonemes for American English. Vowel phonemes are
          typically described by tongue position (high versus low in the
          mouth, and front versus back of the mouth), which correlates
          with the frequencies of the two lowest natural resonances of the
          vocal tract. The lowest resonance frequency is the first
          formant, F1, and the next lowest is the second formant, F2.
          Consonant phonemes are typically described by their places of
          major articulatory constrictions and the manners of forming the
          constrictions.

          Appendix B lists the  consonant and vowel phonemes of English as
          used by DECtalk. The symbols used for each phoneme are
          identified by a key word with the relevant phonemic sound in
          italics.
                   In many cases, phonemes are indicated by two letters,
          instead of special characters or diacritic symbols that often
          appear in dictionaries. The DECtalk PC representation is
          case-insensitive (both uppercase and lowercase are acceptable),
          although lowercase is more commonly used. The letter pairs
          have been designed so that it is not necessary to put a space
          between phonemes of a word. In fact, the space indicates a word
          boundary. DECtalk PC can parse input phonemic letter
          sequences to determine the unique phoneme sequence in all
          cases.
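
          To see why no spaces are needed between phonemes, it can help
          to split a phonemic string by hand. The Python sketch below
          uses a greedy longest-match scan, which is an assumption about
          how such parsing could work and not a description of DECtalk
          PC internals. The symbol sets are a partial, illustrative
          subset drawn from examples in this chapter; the authoritative
          list is the table in Appendix B.

```python
# Partial, illustrative symbol sets (see Appendix B for the real table).
TWO_LETTER = {
    "aa", "ae", "ax", "eh", "ih", "ix", "iy", "ow", "rr", "el", "en",
    "hx", "jh", "th", "dh", "ch", "dx", "tx", "rx", "lx", "nx",
}
ONE_LETTER = set("bdgklmnrstwq")
MARKS = set("'`~-_")  # stress, rule blocking, syllable boundary, silence

def tokenize(phonemic):
    """Split a bracketed phonemic string into symbols, greedily."""
    text = phonemic.strip("[]").lower()  # input is case-insensitive
    tokens = []
    index = 0
    while index < len(text):
        pair = text[index:index + 2]
        char = text[index]
        if pair in TWO_LETTER:
            tokens.append(pair)
            index += 2
        elif char in ONE_LETTER or char in MARKS or char == " ":
            tokens.append(char)  # a space marks a word boundary
            index += 1
        else:
            raise ValueError("unknown symbol at position %d" % index)
    return tokens

if __name__ == "__main__":
    print(tokenize("[m'owtsaart]"))
    print(tokenize("[b'aataxm]"))
```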

                  Phonemes are enclosed in square brackets instead of
          between the more traditional / symbols. The [ and ] characters
          mark the beginning and end of phonemic mode clearly with
          distinctively different symbols. The input format is not
          strictly phonemic because it also permits you to enter certain
          allophones (variants of a phoneme), making the representation
          closer to a broad phonetic transcription. When the command
          [:phoneme arpabet speak on] is given, all text within square
          brackets is treated as phonemic text.


                   Phonemic Correction the Easy Way

          Developers may wish to learn the phonemic code. However, you can
          also consult one of the commonly available dictionaries to
          determine the phonemic pronunciation for the occasional word
          that DECtalk gets  wrong.
           For example, according to the Merriam-Webster Dictionary, the
          pronunciation of the word "Mozart" is


                  \'mot-,sart\


          Using the Table of Appendix B, you can convert this
          transcription to the DECtalk phonemic string
                  [m'owtsaart]


          The User Dictionary


          Every time DECtalk PC mispronounces a word in running text, your
          application could replace the text string ("Mozart")  with a
          phonemic string ([m'owtsaart]). However, if the number of words
          requiring phonemic translation in an application is small, it
          might be simpler to download a dictionary to DECtalk PC and let
          DECtalk PC perform the replacement automatically.  DECtalk PC
          has memory allocated for a loadable dictionary. This dictionary
          is useful in cases where (a) DECtalk makes an error in
          pronunciation, or (b) the pronunciation of a string is unique
          to the application.  For example, if the sequence "n/cl" should
          be pronounced as "not cleared," then a user dictionary entry is
          needed.
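
          For comparison, replacing text in the application itself might
          look like the following Python sketch. The function and table
          names are hypothetical, and the sketch assumes that phonemic
          mode has already been turned on so that bracketed text is
          treated as phonemic input.

```python
import re

# Hypothetical correction table: text word -> phonemic string.
CORRECTIONS = {"Mozart": "[m'owtsaart]"}

def apply_corrections(text, corrections=CORRECTIONS):
    """Replace whole words in running text with phonemic strings."""
    if not corrections:
        return text
    alternatives = "|".join(re.escape(word) for word in corrections)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: corrections[match.group(1)], text)

if __name__ == "__main__":
    print(apply_corrections("Mozart wrote many operas."))
```

          Every text string sent to DECtalk PC would have to pass
          through such a function, which is the work a loaded dictionary
          takes over.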
                  To download a dictionary to DECtalk PC, you must do the
          following:

          1. Create a dictionary file using an editor. The dictionary must
          be in the following format:
                  (a) An entry must start at the first character of the
          line. A line whose first character cannot begin an entry is
          treated as a comment and is not processed.

                  (b) The syntax is a grapheme string followed by a phoneme
          string. A line may be up to 256 characters long, but not longer.

                  (c) A grapheme (letter) string is composed of legal
          graphemes. Legal graphemes are A-Z, a-z, 0-9, and select
          punctuation marks ("!, @, &, (, ), -, \, and /). These
          punctuation marks may not be used at the beginning of the
          grapheme string. The grapheme string may be in either case.
          Uppercase letters match only uppercase; lowercase letters match
          either uppercase or lowercase.
                  (d) The phoneme string is composed of legal phonemes.
          Phonemes are always in square brackets but may be in either
          uppercase or lowercase.

          For example,  to make the word coffee be pronounced tea, you
          would enter the following:
          coffee  [t'iy]
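
          Before compiling, it can be useful to check a dictionary table
          against the format rules above. The Python sketch below is a
          hypothetical checker, not part of the DECtalk PC software. It
          assumes that an entry must begin with a letter or digit, that
          the legal punctuation marks are ! @ & ( ) - \ and /, and that
          any other line is a comment; it does not check that the
          phonemes themselves are legal.

```python
import re

MAX_LINE = 256  # maximum line length given in rule (b)

# Grapheme string starting with a letter or digit, then whitespace,
# then a phoneme string in square brackets.
ENTRY = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9!@&()\\/-]*)\s+(\[[^\[\]]+\])\s*$"
)

def parse_entry(line):
    """Return (grapheme, phoneme) for an entry, or None for a comment."""
    line = line.rstrip("\r\n")
    if len(line) > MAX_LINE:
        raise ValueError("line is longer than 256 characters")
    match = ENTRY.match(line)
    if match is None:
        return None  # treated here as a comment line
    return match.group(1), match.group(2)

if __name__ == "__main__":
    print(parse_entry("coffee  [t'iy]"))
    print(parse_entry("; this line is a comment"))
```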

          After creating your dictionary file, you can compile and load
          the dictionary by doing the following:
          2. Compile the dictionary by typing:

                  userdic <input dictionary table> <output dictionary file>
          Input files have the default extension of .tab, but any
          extension can be used. Output dictionary files must have the
          extension .dtu for the loader to find the file correctly. If no
          output file is specified, a file with the same name and the
          .dtu extension is created for the output.

          For example, if your dictionary table is called mydic.tab,
          type:
                  userdic mydic

          3. Load the user dictionary by typing:
                  dt_load         <output file>

          For example, you would type:
                  dt_load  mydic.dtu

          Your customized dictionary is now loaded.
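
          The naming convention for the compiled file can be summarized
          in a short Python sketch. It only builds the two command lines
          shown above as strings and does not run them; the function
          name is hypothetical.

```python
from pathlib import PureWindowsPath

def compile_and_load_commands(table_name):
    """Return the userdic and dt_load command lines for a table name.

    The output file is assumed to take the table's name with the
    .dtu extension, as described for the default case.
    """
    output = PureWindowsPath(table_name).with_suffix(".dtu")
    return ["userdic %s" % table_name, "dt_load %s" % output]

if __name__ == "__main__":
    for command in compile_and_load_commands("mydic"):
        print(command)
```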
          Note: User dictionary lookups are done only on a single form of
          the word. No affix stripping occurs in user dictionary lookups.
          Therefore, inflected and derived forms must be entered
          separately.

          Warning: If your PC is powered down, you must reload the
          dictionary at power-up.
          Do not automatically assume that DECtalk will mispronounce a
          word, even a difficult one. DECtalk often guesses the
          correct pronunciation of even difficult or very complicated
          words. Also, with the [:pronounce name] command, DECtalk does a
          creditable job with proper names.

          VOWELS


          While  DECtalk recognizes 17 vowel phonemes, these vowels can
          sometimes  change slightly when surrounded by certain phonemes.
          These  variants are discussed below.
          Vowel Allophones


                  Allophones for Vowels + [r]
            The vowels in words such as "beer," "bear," "bar," "bore,"
          and "poor" are different from the available vowel phonemes in
          DECtalk. They require special vowel-r allophones, which are
          listed below.

                  The Schwa Allophones [ax] and [ix]
            Another problem is with the unstressed reduced vowel called
          "schwa" in English. The vowel appears in words such as "about"
          and "kisses." In "kisses," the vowel is produced with a higher
          tongue position, symbolized by the vowel allophone [ix]. You
          can choose between [ax] and [ix] by noting the characteristics
          of the adjacent phonemes, but listening to the words will
          result in the best choice.

                  Syllabic Consonants
            The final syllable in words such  as "butter," "bottle," and
          "button" is usually symbolized in a  dictionary as consisting of
          a short vowel followed by a consonant.  For better sounding
          synthesis, DECtalk uses a set of syllabic  consonants, [rr],
          [el], and [en]  that are realized  without the short schwa.
          Syllabic "r" shares the same symbol as  the phoneme [rr] in a
          word such as "bird," but this leads to no  confusion inside
          DECtalk PC.

           The [em] allophone used in the earliest version of DECtalk no
          longer  exists and must be replaced by the two-phoneme sequence
          [axm] as  in the word "bottom" = [b'aataxm].
           In most situations, you do not need to be concerned about
          allophones because the vowel phonemes will be  automatically
          converted into the appropriate allophones by DECtalk PC  rules.
          For the developer, allophone selection can be induced or
          blocked by using the syllable boundary phoneme [-] and the
          rule-blocking phoneme [~], or by inserting allophone symbols
          in the phonemic spelling.


           CONSONANTS


           The symbols that represent consonants are straightforward. In
          one case, [hx], the two-letter sequence ensures unambiguous
          parsing because the letter "h" is part of some vowel symbols.
           DECtalk PC speaks an English dialect that does not distinguish
          voiced and voiceless "w." Therefore, words like "which" and
          "witch" are pronounced alike as [w'ihch].

           The letter "g" can be pronounced in two ways. In words like
          "gift," the consonant phoneme [g] is used. In words like "gin,"
          the phoneme [jh] is used.
           The letter sequence "th" can be pronounced with a voiceless
          sound  [th] as in "thin" or with a voiced sound [dh] as in
          "this."

          Consonant Allophones

           The consonants [t], [d], [r], and [l] may be replaced by
          special  allophones under certain conditions.

                  Dental Flap [dx]
            The [t] and [d] phonemes are often  replaced by a very brief
          tongue flap allophone [dx] when the  consonant phoneme appears
          between two vowels and the second vowel  is unstressed. DECtalk
          rules automatically insert this allophone in appropriate
          situations.

                  Glottal t
            The [t] phoneme may be replaced by a  glottalized [tx]
          allophone, especially in the word-final position  if the next
          word begins with a sonorant consonant. DECtalk rules  insert the
          allophone where appropriate.

                  Postvocalic [r]
            The [r] that appears after a vowel is  not as constricted as a
          word-initial [r]. DECtalk automatically selects this somewhat
          velarized allophone [rx] or an r-colored diphthong where
          appropriate.

                  Postvocalic [l]
            The [l] that appears after a vowel may  sound different from
          the [l] in other contexts. For some speakers,  the tongue tip
          may not even reach the roof of the mouth. This  postvocalic
          allophone [lx] is automatically selected by DECtalk PC.

                  Glottal Stop [q]
            The glottal stop [q] is used in some  situations to indicate a
          word boundary, especially when the next  word begins with a
          vowel. Overuse of this symbol can lead to a  stilted style of
          speaking.


          Controlling Allophone Selection

            DECtalk automatically  inserts certain other allophones for
          [k], [q], and [nx] when  appropriate. It also selects the
          prevoiced and voiceless  unaspirated allophones of [b], [d], and
          [g]. You cannot access these allophones.

           If DECtalk does not select one of the allophones that has its
          own symbol, such as [dx] or [tx], you can insert the allophone
          symbol directly in a phonemic representation of the word in
          question.

           If DECtalk PC uses one of these allophones inappropriately,
          place the rule-blocking phoneme [~] before the phoneme in
          question to block application of all allophonic substitution
          rules. For example, to say "batter" without a flap being
          substituted for the [t], enter the phonemic string [b'ae~trr].

          Silence Phoneme [_]
           DECtalk PC automatically inserts a silence (brief pause)
          whenever  punctuation appears in the text. The phonemic silence
          symbol [_]  is useful for controlling silence while in phonemic
          mode. Silences  and other pauses are described in more detail
          below.


           STRESS AND SYNTACTIC SYMBOLS


           Correct speech is more than simply stringing together a series
          of  words or phonemes. The meaning of a sentence is carried by
          the  words, plus rhythm, stress, and intonation (pitch change).
          You  recognize a question by the rising intonation of the voice,
          while  a statement is usually accompanied by falling intonation.
          A  speaker can give certain words in a sentence more importance
          by  adding stress (loudness, pitch and length) to them. Pitch
          often reveals  the emotional state of the speaker. For effective
          communication,  you need to consider these expressive features
          as well as the  segmental features of speech.
           As any good actor knows, punctuation alone is not enough to
          indicate the full meaning of a sentence. Some fine points of
          expression cannot be indicated by using phonemic symbols. Full
          control of the expression of a sentence is gained by directly
          changing the duration and pitch of words and phrases and by
          inserting pauses in the appropriate places.

           DECtalk PC uses stress and syntactic symbols to control aspects
          of  rhythm, stress, and intonation patterns. These symbols
          include  punctuation marks such as commas, periods, and
          exclamation marks.  Punctuation marks are recognized by DECtalk
          PC as indicating special  phrasing requirements. The following
          sections explain  how to improve the phrasing in DECtalk PC
          speech.


                                        STRESS AND SYNTACTIC SYMBOLS



                          Stress Symbols

                  '               Primary Stress
                  `               Secondary Stress
                  "               Emphatic Stress
                  /               Pitch Rise
                  \               Pitch Fall
                  /\              Pitch Rise and Fall


                          Syntactic Symbols

                  -               Syllable Boundary
                  *               Morpheme Boundary
                  #               Compound Noun
                  (               Beginning of Prepositional Phrase
                  )               Beginning of Verb Phrase
                  ,               Clause Boundary
                  .               End of Sentence
                  ?               End of Question
                  !               End of Exclamation
                  +               New Paragraph





                  Primary Stress [']


           Most content words of English (nouns, verbs, adjectives, and
          adverbs) contain one primary stressed syllable. DECtalk PC
          represents primary stress on a syllable with an apostrophe [']
          placed immediately before the stressed vowel phoneme of the
          word, as in the following example for the word "butter."


                  [bahtrr].
                   (No stress, flat intonation, too rapid.)

                  [baht'rr].
                   (Stress on the wrong syllable.)

                  [b'ahtrr].
                   (Correct.)



          You can also place the primary stress symbol between words, in
          which case it modifies the next word. For example, in the
          sentence "He rang up the sale," DECtalk PC treats "up" as a
          preposition (without stress) instead of a particle. "Up" is
          correctly stressed if you write the sentence as


                  He rang [']up the sale.

          There can be no space between a stress or syntactic phoneme
          (for example, [']) and the following word.

          Secondary Stress [`]


           Use the secondary stress symbol [`] to indicate a degree of
          stress  that is between primary stress and unstressed. Secondary
          stress is  appropriate in the following cases.


                   To highlight the next strongest syllable of polysyllabic
                   words, such as "demonstration."

                           [d`ehmaxnstr'eyshaxn].

                   On second parts of compound nouns, as in "answering
                   machine."

                           ['aensrrixnx#maxsh`iyn].

                   In some very common words such as "I" and "we."



          DECtalk realizes secondary stress by lengthening the vowel
          sound more than in an unstressed syllable (but less than under
          primary stress). A pitch rise may also occur on an early
          secondary stress. In most cases, you can leave out the
          secondary stress symbol.


          Emphatic Stress ["]

          You can place the emphatic stress symbol ["] before any vowel to
          give emphasis to that syllable of the word. Good readers of
          English text understand the message of the sentence well enough
          to  pick out the most important word and emphasize it. DECtalk
          merely  pronounces words; it does not understand the sentences
          it is  saying. DECtalk cannot place emphasis on words to give a
          completely different meaning to the sentence unless you use the
          emphatic stress symbol. Here is an example.


                  Dennis loves Mary.
                   (Usual neutral pronunciation.)

                  [d"ehnihs] loves Mary or ["]Dennis loves Mary.
                   (Dennis -- not Frank -- loves Mary.)

                  Dennis loves [m"ehriy] or Dennis loves ["]Mary.
                   (Dennis loves Mary -- not Jill.)


          The exclamation point has a similar effect on the final stress
          of a sentence.


                  Help!



          Unstressed Syllables
           The English language contains a set of words that are either
          unstressed or have reduced stress. These are called syntactic
          function words and include the following types:


                  Prepositions (for, over)
                   Conjunctions (and, but)
                   Determiners (the, some)
                   Auxiliary verbs (is, has)
                   Pronouns (her, myself)
                   Clause introducers (which, that)



          These words have reduced stress in their dictionary entries. It
          is sometimes necessary to emphasize a function word that is
          stored  in DECtalk PC's dictionary without stress. You can do
          this by  including a primary stress symbol or an emphatic stress
          symbol in  the phonemic transcription as in the following
          example.


                  He went ['owvrr] (or [']over) the fence, not under it.
                   It was the fence that he went ['owvrr] (or [']over).

          Pitch Control  [/], [\], [/\]

          DECtalk contains built-in rules to determine the pitch contour
          of  a sentence. While these rules are correct most of the time,
          you  can override them by placing the pitch rise [/], pitch fall
          [\],  and pitch rise and fall [/\] symbols before selected words
          (or  vowels if you want finer control).
           The [/] and [\] symbols must alternate, and the first symbol
          must  be a rise. Note that you can place both a rise and a fall
          on the  same syllable by using [/\]. You can hear the difference
          by trying  the following two sentences.




                  It's a mad mad mad mad world.


                   It's a [/]mad [\]mad [/]mad [\]mad [/\]world.
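          If an application generates pitch symbols automatically, it may
          help to check the alternation rule before the text is sent. The
          following Python sketch is not part of DECtalk PC. The function
          name is hypothetical, and treating [/\] as a rise followed
          immediately by a fall is an assumption based on the example
          above.

```python
import re

# Matches the three pitch symbols as they appear in text: [/\], [/], and [\].
PITCH_SYMBOL = re.compile(r"\[(/\\|/|\\)\]")


def pitch_symbols_alternate(text: str) -> bool:
    """Return True if rises and falls alternate, starting with a rise.

    Assumption: [/\\] counts as a rise followed by a fall.
    """
    expect_rise = True
    for match in PITCH_SYMBOL.finditer(text):
        # group(1) is "/", "\" or "/\"; walk it one mark at a time.
        for mark in match.group(1):
            is_rise = mark == "/"
            if is_rise != expect_rise:
                return False
            expect_rise = not expect_rise
    return True


if __name__ == "__main__":
    sentence = "It's a [/]mad [\\]mad [/]mad [\\]mad [/\\]world."
    print(pitch_symbols_alternate(sentence))
```

          A string that fails this check (for example, one that begins
          with a fall) is a candidate for manual review before it is
          spoken.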



          Word Boundary

          Any whitespace character (space, tab, or carriage return) in the
          text indicates a word boundary. DECtalk uses word boundary
          symbols to select the word-beginning or word-ending allophone
          of a phoneme.

           Some host computers automatically insert a carriage return into
          lines that are too long (and would go off the edge of the screen
          or paper). This may cause DECtalk PC to pronounce text
          incorrectly if  a carriage return occurs in the middle of a
          word. You can prevent  this problem by breaking long sentences
          with a carriage return at  an appropriate place.
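          Where the host software can pre-process text, the line breaks
          can be placed automatically. The following is a minimal Python
          sketch, assuming the text passes through a host-side script
          before it reaches DECtalk PC. The function name and the
          72-column width are illustrative assumptions, and the newline
          produced here stands in for the carriage return.

```python
import textwrap


def wrap_for_dectalk(text: str, width: int = 72) -> str:
    """Re-wrap text so that every line break falls between words.

    Words are never split, even if a single word is longer than width.
    """
    return textwrap.fill(
        text,
        width=width,
        break_long_words=False,   # never cut a word in the middle
        break_on_hyphens=False,   # keep hyphenated compounds on one line
    )


if __name__ == "__main__":
    sample = ("Some host computers automatically insert a carriage return "
              "into lines that are too long.")
    print(wrap_for_dectalk(sample, width=40))
```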

          Syllable Boundary [-]


           DECtalk uses a set of rules to determine where words break into
          syllables, so consonants within words are assigned to their
          correct syllable. Use the syllable boundary symbol [-] to tell
          DECtalk PC where to assign the consonants within ambiguous
          words. (This type of error rarely happens in DECtalk PC.)


                  Example: oration

                   [ow-r'eyshaxn] (DECtalk made an incorrect guess.)
                   [owr-'eyshaxn] (Correct.)



          Morpheme Boundary [*]

           English words are made up of meaningful units called
          morphemes. For example, "spell" has only one morpheme, while
          "misspelling" is made up of three: "mis," "spell," and "ing."
           In most cases, the pronunciation of a word does not depend on
          morpheme boundaries. There are exceptions, however, in which
          case the morpheme boundary symbol [*] can be used to force the
          correct pronunciation. For example, "misspelling" should be
          pronounced with a double "s" because each "s" belongs to a
          different morpheme. Adding the morpheme boundary symbol
          improves the pronunciation of the word.


                  misspelling.

                   mixsp'ehlixnx (text-to-phoneme translation by DECtalk).
                   (The single "s" is too short.)

                   [mixs*sp'ehlixnx]
                   (Better.)



          Compound Noun [#]

           Compound words, such as "rush-hour," "coffee cup," "Thermos
          bottle," and "answering machine," should be spoken with less
          stress on the second word. Also, words that were once
          compounds, such as "backache," require decomposition for
          correct pronunciation.

           DECtalk PC's dictionary includes an extensive list of compound
          words.  You can use the compound noun symbol [#] to correct
          compounds that  are not in the dictionary. For example, for
          "backache," type the  following phonemic transcription.


                  [b'aek#`eyk].



          Using a hyphen in compound words (for example, "back-ache" or
          "rush-hour traffic") produces the correct pronunciation most of
          the time. You rarely need the [*] and [#] phoneme symbols.


          Beginning of Verb Phrase [)]
           Moderately long declarative sentences are usually spoken as if
          they contain two units: a noun phrase and a verb phrase. There
          is  sometimes a slight pause between these two phrases, but
          there is also a slowing down at  the boundary, and the pitch
          tends to fall and then rise. DECtalk  searches for this
          syntactic boundary to change pitch. However, the  rarity or
          ambiguity of some verbs can cause confusion.


                  The old man in the chair was rocking slowly.
                   (Correct verb phrase detected.)

                   The old man in the chair sat rocking slowly.
                   (Verb phrase not detected; pure mechanical analysis of
                   the sentence does not show where "sat" belongs.)


                   The old man in the chair [)s'aet] rocking slowly.
                   (Phonemic correction.)





          The right parenthesis [)] symbol is useful where a separation is
          needed between phrases but a comma is too strong. For example,
          you  can use [)] to indicate a dangling prepositional phrase.


                  She hit the man with the umbrella.
                   (The man carries the umbrella.)


                   She hit the man [)] with the umbrella.
                   (She uses the umbrella.)

          Past versions of DECtalk also used the [)] symbol for a second
          function: to indicate alternate pronunciations of words that are
          spelled the same but pronounced differently (homographs). (In
          DECtalk V4.0, this has been replaced by a slash "/".) For
          example, the word "insert" is either a noun or a verb. As a
          noun, it is pronounced ['ihnsrrt] and as a verb it is
          pronounced [ixns'rrt]. DECtalk will eventually choose many of
          these alternate forms automatically. For example, it will
          eventually disambiguate the following sentences:

                  He refused the produce.
                  He produced the refuse.


          As you can see, it takes a bit more intelligence to choose the
          correct pronunciation automatically. Currently DECtalk
          defaults to the more frequent pronunciation. If this
          pronunciation is incorrect, simply place a slash at the
          beginning of the word to obtain the alternate pronunciation.

                  The experienced secretary inserts more /inserts per hour.

          You can also phonemicize the word, e.g.,

                  The experienced secretary inserts more ['ihnsrrts] per hour.

          Note: Placing a slash at the beginning of a homograph to obtain
          the alternate pronunciation will work only if the [:punc some]
          command is enabled.


          Appendix  C  lists the pairs of common homographs that DECtalk
          PC  knows.


          Clause Boundary [,]
           When a sentence is composed of more than one clause, it should
          be  spoken in such a way that the listener can easily separate
          the  sentence into its component clauses. The comma [,] is the
          symbol used to indicate clause boundaries. A comma in text and
          a comma in phonemic transcription have identical impact on the
          acoustic realization of a sentence.

           Inserting a comma improves the quality of spoken sentences in
          the  following cases.


                  After an introductory prepositional phrase:

                   In particular cars cause pollution.
                   (Poor phrasing.)

                   In particular, cars cause pollution.
                   (Correct.)

                  Around a parenthetical remark:

                   A picture it seems is worth . . .
                   (Poor phrasing.)

                   A picture, it seems, is worth . . .
                   (Correct.)

                  In a list of more than two items:

                   They ate apples oranges and bananas.
                   (Poor phrasing.)

                   They ate apples, oranges and bananas.
                   (Correct.)

                  After similar types of adjectives:

                   The tall, angular gentleman ...

                  Around phrases and clauses in a particularly long
                  sentence.






          Period [.]

           A sentence is usually a single, complete thought. It is also
          the longest utterance that you can comfortably speak in one
          breath. DECtalk inserts a pause when it finds a period that
          marks the end of the sentence, duplicating the human speaker's
          pause to take a breath.

           The [.] symbol also tells DECtalk that a complete sentence has
          been sent and it is safe to begin speaking. In letter and word
          mode, DECtalk will speak immediately even if no period or comma
          has been seen. DECtalk also tests each period to make sure it
          is not part of a known abbreviation.

          Question Mark [?]


           The simplest way to indicate a question in English is by a
          rising  tone at the end of a sentence, although true question
          intonation  is not that simple and depends on the meaning of the
          question.
           There are many cases in English where a question (rising)
          intonation is not appropriate, even though the sentence ends
          with  a question mark. Rhetorical questions or quotations may
          contain a question  mark, but the speaker ends with a period
          (falling tone).  Sentences that begin with "wh" words ("who,"
          "what") usually end with a falling tone, even if they are
          questions.   DECtalk is smart enough to recognize "wh" questions
          and  speak them correctly.


                  Laura ate her broccoli?
                   (DECtalk PC asks a question.)


                   What time is it?
                   (DECtalk PC recognizes a wh-question and does not rise
                   at the end.)

          Exclamation Point [!]

           Exclamations are short statements spoken with special emphasis.
          DECtalk interprets an exclamation point to mean that the last
          stressed syllable in the sentence should have extra emphasis.


                  Stop!


          Long sentences ending with an exclamation point typically have a
          single word that receives extra stress. DECtalk PC has no way of
          knowing which word to stress and chooses the last word by
          default.  Use the emphatic stress symbol ["] to emphasize a
          different word  when the last word is not appropriate.


                  Joan won the marathon!
                   (DECtalk PC emphasizes the last word.)

                   ["]Joan won the marathon.
                   (Correct.)



          New Paragraph [+]


           The new paragraph phoneme [+] should be inserted in text
          wherever a new thought has begun. (DECtalk does not do this
          automatically because there is no standard new paragraph
          indicator in general text; the tab is used in too many other
          ways.)
           The new paragraph phoneme [+] modifies the intonation contour
          and  adds variety to running text. The first sentence of a new
          paragraph is produced with a higher, more lively fundamental
          frequency. DECtalk will also pause longer between paragraphs to
          give the listener an indication of a change of topic.

           [+] This paragraph has the [+] phoneme inserted in the
          appropriate  place. The new paragraph symbol can be used in
          other situations,  such as to help indicate the start of a new
          mail message in a list  of mail messages.
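          Because DECtalk does not detect paragraphs itself, the host can
          insert the symbol before the text is sent. The following Python
          sketch assumes, purely for illustration, that paragraphs in the
          source text are separated by blank lines; the function name is
          hypothetical. The [+] symbol is only interpreted when phonemic
          interpretation of square brackets is enabled.

```python
import re


def mark_paragraphs(text: str) -> str:
    """Prefix each blank-line-separated paragraph with the [+] symbol."""
    # Split on one or more blank lines and discard empty fragments.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return "\n\n".join("[+] " + p for p in paragraphs)


if __name__ == "__main__":
    messages = "First message text.\n\nSecond message text."
    print(mark_paragraphs(messages))
```

          The same function could mark the start of each message in a
          list of mail messages, provided the messages are joined with
          blank lines first.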


          DIRECT CONTROL OF DURATION AND PITCH

           Displaying the correct emotion through voice alone is a
          difficult task, as any radio actor will tell you. The best
          method is to experiment with phonemic symbols until you achieve
          the quality you want. Emotional content is usually connected to
          the sentence content, so varying both together is the best way
          to convey feelings.

           For example, you can have DECtalk say a simple phrase like
          "Good  morning" in several different ways.


                  Good morning.
                   (Normal tone.)

                   Good morning!
                   (Emphatic.)


                   Good morning?
                   (Questioning.)

                   [g"uhd] morning.
                   (Emphasize "good.")



          If these alternatives do not produce what you need, you can use
          direct prosodic control. You must represent the entire sentence
          phonemically,  specifying a duration for each phoneme that does
          not  match the natural model. You should also give some or all
          phonemes  specific target pitch values. DECtalk PC will compute
          smooth  transitions between pitch values, where the specified
          pitch is  reached at the end of the phoneme.


          Duration and Pitch [<>]


           DECtalk PC uses angle brackets [<>] to enclose duration and
          pitch  values of phonemes.
           The format is


                  <duration,pitch>



          where duration is the length of the phoneme in milliseconds (ms)
          and pitch is the fundamental frequency of the phoneme in hertz
          (Hz).

           Any phoneme may be followed by angle brackets to alter the
          default duration and pitch. If either value is omitted, or
          specified as 0, the default value is used. The two values are
          separated by a comma.


                  [ow]
                   (Normal phonemic specification.)

                  [ow<1000>]
                   (1,000 ms duration.)

                  [ow<,90>]
                   (Default duration, 90 Hz pitch at end. Note the
                   position of the comma.)

                  [ow<1000,90>]
                   (1,000 ms duration, 90 Hz pitch at end.)



          For example, to say "Oh?" with a greater degree of skepticism
          than  DECtalk PC normally imparts, you could type


                  [_<,90>ow<400,150>].



          The [ow] phoneme begins at 90 Hz and ends (after 400 ms) at 150
          Hz.
           Note the use of the silence symbol [_] in the example just
          given.  Pitch and duration values must always be attached to a
          preceding  phoneme. The silence symbol is used so that the value
          (90 Hz in  this example) is applied to the beginning tone of the
          next spoken  phoneme [ow].
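          When many phonemes need explicit values, the angle-bracket
          notation can be generated rather than typed. The following
          Python sketch only formats strings according to the
          <duration,pitch> rules given above. The helper names prosody
          and phonemic are hypothetical and are not part of DECtalk PC.

```python
from typing import Optional


def prosody(phoneme: str,
            duration_ms: Optional[int] = None,
            pitch: Optional[int] = None) -> str:
    """Format one phoneme with an optional <duration,pitch> suffix."""
    if duration_ms is None and pitch is None:
        return phoneme                        # e.g. ow
    if pitch is None:
        return f"{phoneme}<{duration_ms}>"    # e.g. ow<1000>
    # An omitted duration leaves the field before the comma empty.
    duration = "" if duration_ms is None else str(duration_ms)
    return f"{phoneme}<{duration},{pitch}>"   # e.g. ow<,90> or ow<1000,90>


def phonemic(*segments: str) -> str:
    """Wrap already formatted segments in phonemic brackets."""
    return "[" + "".join(segments) + "]"


if __name__ == "__main__":
    # The skeptical "Oh?": a silence sets the starting pitch for [ow].
    print(phonemic(prosody("_", pitch=90), prosody("ow", 400, 150)))
```

          Keeping the leading silence as a separate segment makes the
          starting pitch of the following phoneme explicit in the calling
          code.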

           Many of the phonemes (all except the stop  consonants p, t, k,
          b, d, and g) can be sustained in a monotone  for an arbitrarily
          long duration by using direct prosodic control.  For example, to
          sustain "ah" for a duration of 10 seconds (10000  ms) at a pitch
          of 120 Hz, type


                  [_<,120>ah<10000,120>].
                   (Produces "ahhhhhhh . . .")



          To produce a prolonged sigh, type


                  [_<100,150>ah<2500,80>].



          where the silence phoneme causes the pitch contour to start at
          150 Hz at the beginning of the "ah" and end at 80 Hz at the end
          of the "ah."


          Singing


          Singing uses different voice control techniques than
          conversation.  Even untrained singers add liveliness to the sung
          notes by varying pitch slightly, a quality called vibrato.
          Singing in DECtalk would sound mechanical without vibrato.
           Each word or syllable is defined phonemically. The first number
          following a phoneme is the duration in milliseconds, and the
          second number is the pitch in Hertz. Vowels and consonants not
          assigned a pitch remain at the same pitch as preceding segments.
          You can intersperse silence phonemes if you wish.

           DECtalk stays exactly on pitch when the pitch is specified in
          hertz (Hz). You can add vibrato (to give a more realistic
          singing quality) by specifying notes with pitch values from 1
          to 37. Note 1 is C2 and 37 is C5 on an equal-tempered scale (A4
          = 440 Hz) as shown below. C2 is the second C below middle C on
          a piano, C4 is middle C, and so on.

           An added feature of DECtalk PC is the ability to specify notes
          by their coded value (below). The coded value is simpler to
          write than the pitch in hertz and selects the same note; unlike
          a pitch in hertz, it is sung with vibrato.

           When notes are specified, DECtalk PC reaches the desired pitch
          within  about 100 ms after the start of the phoneme and adds
          vibrato after  changing to this pitch. When you give a specific
          non-sung pitch,  DECtalk PC reaches the pitch target at the very
          end of the phoneme  with no vibrato. The following example makes
          DECtalk PC "sing" the  first four notes of Beethoven's Fifth
          Symphony.


                   [d<100,17>aa<400> d<100,17>aa<400>]
                   [d<100,17>aa<400> d<120,13>aa<700>].
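          Coded note values can be computed from note names instead of
          being looked up by hand. The following Python sketch assumes
          the numbering described above (1 = C2 through 37 = C5, one step
          per semitone). The helper names note_code and sung_syllable are
          hypothetical, and the note names used for the motif are simply
          those that correspond to coded values 17 and 13.

```python
SEMITONES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
             "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


def note_code(name: str) -> int:
    """Convert a note name such as 'E3' or 'C#4' to a coded value (1-37)."""
    pitch_class, octave = name[:-1], int(name[-1])
    code = (octave - 2) * 12 + SEMITONES[pitch_class] + 1
    if not 1 <= code <= 37:
        raise ValueError(f"{name} is outside the C2-C5 range")
    return code


def sung_syllable(consonant: str, vowel: str, note: str,
                  consonant_ms: int, vowel_ms: int) -> str:
    """Format a consonant+vowel syllable; the vowel inherits the pitch."""
    return f"{consonant}<{consonant_ms},{note_code(note)}>{vowel}<{vowel_ms}>"


if __name__ == "__main__":
    # Same syllables as the four-note example above, in one bracketed string.
    motif = [("d", "aa", "E3", 100, 400)] * 3 + [("d", "aa", "C3", 120, 700)]
    print("[" + " ".join(sung_syllable(*s) for s in motif) + "].")
```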


          The following table contains the pitch values which can be used
          to allow your DECtalk PC to sing. You may use either the number
          in Hz or the coded value.

          Coded
          Value   Note    Pitch (Hz)

          1       C2      65
          2       C#      69
          3       D       73
          4       D#      77
          5       E       82
          6       F       87
          7       F#      92
          8       G       98
          9       G#      103
          10      A       110
          11      A#      116
          12      B       123
          13      C3      130
          14      C#      138
          15      D       146
          16      D#      155
          17      E       164
          18      F       174
          19      F#      185
          20      G       196
          21      G#      207
          22      A       220
          23      A#      233
          24      B       247
          25      C4      261
          26      C#      277
          27      D       293
          28      D#      311
          29      E       329
          30      F       348
          31      F#      370
          32      G       392
          33      G#      415
          34      A       440
          35      A#      466
          36      B       494
          37      C5      523

          Approximate voice ranges, by coded value:

          Bass            5 (E2)  to 29 (E4)
          Baritone        8 (G2)  to 32 (G4)
          Tenor           13 (C3) to 36 (B4)
          Alto            18 (F3) to 37 (C5)
          Soprano         24 (B3) to 37 (C5)







          Note: C4 is middle C
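          The pitch column follows from the equal-tempered scale with A4
          (coded value 34) at 440 Hz. The following Python sketch shows
          the relationship; the function name is hypothetical. The table
          lists whole numbers, so the computed values agree with it only
          to within about 1.5 Hz.

```python
def coded_value_to_hz(code: int) -> float:
    """Equal-tempered frequency for a coded note value (34 = A4 = 440 Hz)."""
    if not 1 <= code <= 37:
        raise ValueError("coded values run from 1 (C2) to 37 (C5)")
    # Each coded step is one semitone, i.e. a factor of 2 ** (1 / 12).
    return 440.0 * 2.0 ** ((code - 34) / 12)


if __name__ == "__main__":
    for code in (1, 13, 25, 34, 37):
        print(code, round(coded_value_to_hz(code), 2))
```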


                                                  CHAPTER 6



                                        MODIFYING THE VOICES



          This chapter shows how to change and create voices spoken by
          DECtalk PC. Changing and creating voices requires a certain
          knowledge of acoustic phonetics and the human voice. This
          chapter necessarily goes into detail on speaker-definition
          parameters. Understanding the information is necessary for
          effective voice modification and will make the task easier.

          VOICE CHARACTERISTICS

          The DECtalk PC has a set of simple commands that you can use to
          change the speaking rate, or to change the voice to one of nine
          different voices -- male, female, child, or a
          developer-definable voice -- as shown below. You can use other,
          more complex commands to modify the characteristics of each
          voice, or to create a new voice or special effects. The complex
          commands require skill and experience to use effectively, but
          the simple commands are easy to use in normal DECtalk PC
          applications.

          We can usually tell whether the voice of a stranger at the
          other end of a telephone line is that of a man, woman, or child.
          Slight differences in voice quality are characteristic of these
          different speakers. For example, women's and children's voices
          are usually higher pitched than men's voices. The size of the
          head and length of the vocal tract account for some of the
          differences. We also notice that some people speak more quickly
          or more distinctly than others.

          Chapter 5 described ways in which DECtalk pronunciation could
          be modified. This chapter shows how voice characteristics
          themselves can be changed by selecting the speaking rate, sex,
          and other voice parameters.

          DECtalk has a number of commands which can be used to modify
          voice characteristics. Because the commands are entered within
          phonemic brackets [ and ], you must have the [:phoneme arpabet
          speak] command set to ON. This option is set OFF at power-up.


          NOTE: DECtalk interprets text between square brackets as
          phonemes only after the [:phoneme arpabet speak on] command
          has been sent. For DECtalk to interpret the [ and ] and the
          characters between them literally, the
          [:phoneme arpabet speak off] command must be sent.

          WARNING: If the command [:phoneme arpabet speak on] is set and
          you forget the final "]", DECtalk PC will try to interpret
          ASCII text as phonemes, skipping over illegal letter
          combinations. The resulting speech will sound garbled.
          To recover,  close phonemic mode by typing "]".
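
          A host program can guard against this condition by checking
          each string before it is sent. The following Python sketch
          simply tracks whether the last bracket seen was an opening one;
          the function names are hypothetical, and appending "]" as the
          repair is an assumption based on the recovery step above.

```python
def has_unclosed_bracket(text: str) -> bool:
    """Return True if the text ends while still inside [ ... ]."""
    inside = False
    for ch in text:
        if ch == "[":
            inside = True
        elif ch == "]":
            inside = False
    return inside


def close_phonemic_mode(text: str) -> str:
    """Append the missing "]" so following text is not read as phonemes."""
    return text + "]" if has_unclosed_bracket(text) else text


if __name__ == "__main__":
    print(close_phonemic_mode("He rang [']up the sale."))
    print(close_phonemic_mode("[b'ahtrr"))
```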

          The commands that modify the voice characteristics are

                  1.      Speaking rate [:ra _]
                  2.      Comma pause duration [:cp _]
                  3.      Period pause duration [:pp _]
                  4.      New voice [:n_]
                  5.      Design voice [:dv _]

          where the "_" represents a variable letter, value, or parameter.
          Each of the first three commands has a single, simple function.
          The fourth (new voice) command selects the standard DECtalk
          voices. The fifth (design voice) command allows you to create a
          completely new voice.

                  SPEAKING RATE [:ra _]
          The default speaking rate is 180 words per minute (wpm).
          Speaking  rate values have been calibrated with a 300-word
          standard  paragraph using Fairbanks, G. Voice and Articulation
          Drill Book. Second  Edition. Harper and Row, 1960, p. 114.
          Speaking rates can be adjusted to be very slow, very  fast, or
          anywhere in between by using the following commands:

          [:ra 120]       This rate is the slowest, 120 wpm. It is ideal
          for situations where material such as a phone number is  to be
          copied down by the listener.
          NOTE: It may be frustrating to listen to extended speech at slow
          rates unless the listener is actually copying down every word.

          [:ra 160]       This rate is moderate, 160 wpm. This rate sounds
          a little slow, but may be preferred in certain situations.
          [:ra 180]       This rate is normal (moderately fast), 180 wpm.
          It is the default rate for DECtalk PC, and is ideal for
          listening to continuous text under optimal  conditions.

          [:ra 240]        This rate is faster, 240 wpm. Practiced
          listeners can skim material at this rate and prefer it when
          scanning text for important sections. Inexperienced  listeners
          may not understand every word at this  rate.
          [:ra 350]       This rate is very fast,  350 wpm. It is too fast
          for many  people to follow, but it does have  applications in
          special circumstances.

          [:ra 550]       This rate is the fastest, 550 wpm. It is too
          fast for many people to follow, but it does have applications
          for unsighted individuals who wish to scan text quickly. This
          rate is 200 wpm faster than the fastest rate of any previous
          version of DECtalk.

          Any speaking rate between 120 and 550 wpm is permitted in the
          [:ra _] command. Rates specified outside this range are limited
          to the nearest legal value.
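
          The limiting behavior described above can be mirrored on the
          host side before a command is sent. This is a minimal Python
          sketch; the function and constant names are hypothetical, and
          the limits are the 120 and 550 wpm values given in the text.

```python
RATE_MIN_WPM = 120  # slowest legal rate given in the text
RATE_MAX_WPM = 550  # fastest legal rate given in the text


def rate_command(wpm: int) -> str:
    """Build a [:ra _] command, limiting wpm to the nearest legal value."""
    limited = max(RATE_MIN_WPM, min(RATE_MAX_WPM, wpm))
    return f"[:ra {limited}]"


if __name__ == "__main__":
    for requested in (90, 180, 700):
        print(requested, rate_command(requested))
```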



          Changes in speaking rate influence the duration and especially
          the number of pauses in text, as well as the duration of
          individual phonemes. At rates below 140 wpm, DECtalk PC inserts
          pauses at all phrase boundaries, and pauses and phonemes near
          the ends of phrases are lengthened considerably. At rates
          faster than 240 wpm, DECtalk PC deletes the comma pause, and
          other pauses and phonemes are shortened. (Near the beginning of
          phrases, phonemes are fairly short at both slow and fast
          speaking rates.)

                   PAUSE DURATIONS [:pp _] and [:cp _]
          At the normal speaking rate of 180 words per minute, DECtalk PC
          pauses about half a second after a period in the text and about
          a  sixth of a second after a comma. These pause durations are
          adjusted appropriately when you change the speaking rate.

          Speech Command Parameters

          Command   Minimum   Maximum   Unit or Parameter

          :ra       120       550       Words per minute
          :cp       -40       30000     Milliseconds
          :pp       -380      30000     Milliseconds
          :n_       NA        NA        Voice letter (p b h f k r u d w v)
          :dv       --        --        See Appendix D

          In some situations, you might like a longer pause after a
          period without changing the speaking rate. For example, to get
          DECtalk PC to read a list of words at a normal rate with
          5-second pauses after each word (to allow the listener to write
          them down), you could use one of the following commands and end
          each word with a comma (continuation rise intonation) or a
          period (falling intonation).

          [:pp 4500]      Add a period pause of 4500 ms (4.5 seconds) to
          the standard half-second pause that occurs after a period in the
          text. The total pause between words will be about 5 seconds.

          [:cp 4800]      Add a comma pause of 4800 ms (4.8 seconds) to
          the standard sixth-of-a-second pause that occurs after a comma
          in the text at normal speaking rate. The total pause between
          words separated by a comma will last about 5 seconds.

          [:pp 0 :cp 0]   Reset the period pause and comma pause to their
          normal default values.

          The permitted range for a period pause is from -380 to 30000 ms.
          A negative value shortens the standard period pause. The
          permitted range for a comma pause is from -40 to 30000 ms.
          Values specified outside this range will be limited to the
          nearest legal value.
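
          The arithmetic behind these examples can be sketched in Python
          as follows. The names are hypothetical. The standard pauses of
          500 ms and 167 ms are the approximate values quoted above for
          180 wpm, so the result is only a rough guide at other rates.

```python
# Approximate standard pauses at 180 wpm, taken from the text
# ("about half a second" and "about a sixth of a second").
STANDARD_PAUSE_MS = {"pp": 500, "cp": 167}

# Permitted ranges for the added pause, in milliseconds.
LEGAL_RANGE_MS = {"pp": (-380, 30000), "cp": (-40, 30000)}


def pause_command(kind: str, total_ms: int) -> str:
    """Build a [:pp _] or [:cp _] command for a desired total pause.

    kind is "pp" (period pause) or "cp" (comma pause). The value sent
    is the desired total minus the standard pause, limited to the
    permitted range.
    """
    low, high = LEGAL_RANGE_MS[kind]
    extra = total_ms - STANDARD_PAUSE_MS[kind]
    extra = max(low, min(high, extra))
    return f"[:{kind} {extra}]"


if __name__ == "__main__":
    print(pause_command("pp", 5000))
    print(pause_command("cp", 4967))
```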

                  SELECTING A STANDARD VOICE [:n_]

          DECtalk PC has nine built-in voices and one voice that is
          definable. You can refer to each voice by the command [:n_],
          where "_" is a letter representing one of the DECtalk voices.
          The values of "_" are p = Paul, h = Harry, f = Frank,
          d = Dennis, b = Betty, u = Ursula, r = Rita, w = Wendy,
          k = Kit, and v = Val.

          You can change voices with the new voice command as in this
          example.
                  [:nb] Hello. I'm Betty.

          You can also change voices in the middle of a sentence.
                  [:np] This is a demo [:nb] of a sudden change in voice.

          If a voice change request occurs in the middle of a sentence,
          DECtalk PC will automatically pause very slightly. The pause is
          the equivalent  of inserting a comma before the mid-sentence
          command. For example,  you could type the previous sentence as
          follows.
                  [:np] This is a demo, [:nb] of a sudden change in voice.

          Such a pause in DECtalk PC, however, is barely noticeable.
          Nevertheless, it is good practice to always end a sentence
          (insert a period)  before changing voices. This allows the
          listener to prepare for a  new speaker.
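
          A host program might keep the voice letters in a small lookup
          table. The Python sketch below is an illustration only; the
          helper names are hypothetical, and the letters are those listed
          at the start of this section.

```python
# Voice letters as listed in this section.
VOICE_LETTERS = {
    "paul": "p", "harry": "h", "frank": "f", "dennis": "d",
    "betty": "b", "ursula": "u", "rita": "r", "wendy": "w",
    "kit": "k", "val": "v",
}


def voice_command(name: str) -> str:
    """Return the [:n_] command for a voice name such as "Betty"."""
    return f"[:n{VOICE_LETTERS[name.lower()]}]"


def speak_as(name: str, sentence: str) -> str:
    """Prefix a complete sentence with the voice change command."""
    return f"{voice_command(name)} {sentence}"


if __name__ == "__main__":
    print(speak_as("Betty", "Hello. I'm Betty."))
```

          Passing only complete sentences to speak_as follows the
          recommendation to end a sentence before changing voices.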


                  DESIGNING A VOICE WITH SPEAKER-DEFINITION PARAMETERS [:dv _]


          The DECtalk PC voices provide an adequate selection for most
          developers' applications. However, if you have a special
          application requiring a monotone or unusual voice, you can
          modify the parameters defined in this section on a
          trial-and-error basis to get the desired voice.

          The nine built-in voices of DECtalk PC are distinguished from
          one another by a large set of speaker-definition parameters.

          Speakers can differ in sex, age, head size and shape, larynx
          size and behavior, pitch range, pitch and timing habits,
          dialect, and emotional state. DECtalk PC cannot approximate all
          of these options. Therefore, the space of distinguishable
          voices is quite limited, even though DECtalk PC has many
          speaker-definition parameters that can be modified.

          The design voice [:dv _] command sets the speaker-definition
          parameters, which can be entered as a string of several
          parameters in one command or one at a time.
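
          As an illustration, a host program could assemble a design
          voice command from a table of parameter values. The Python
          sketch below uses a hypothetical function name and follows the
          syntax of the examples in this chapter, such as
          [:dv ri 90 sm 0]. It does not check parameter names or ranges.

```python
def design_voice_command(params: dict) -> str:
    """Build one [:dv _] command from parameter name/value pairs.

    Parameters appear in the command in the order given in the dict.
    """
    if not params:
        raise ValueError("at least one parameter is required")
    parts = [f"{name} {value}" for name, value in params.items()]
    return "[:dv " + " ".join(parts) + "]"


if __name__ == "__main__":
    # Several parameters entered as one string.
    print(design_voice_command({"ri": 90, "sm": 0}))
    # The same parameters entered one at a time.
    print(design_voice_command({"ri": 90}))
    print(design_voice_command({"sm": 0}))
```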

          The following sections discuss speech production, acoustics, and
          perception. Some of the information is relatively technical,
          but the examples should make it possible for all developers to
          effectively modify any parameter and listen to the results.



                  Changing Sex and Head Size

          Six speaker-definition parameters control the size and shape of
          the head. These parameters are listed below and are described in
          the chapter on modifying voices.

                  sx      Sex, 1 (male) or 0 (female)
                  hs      Head size, in percent
                  f4      Fourth formant resonance frequency, in Hz
                  f5      Fifth formant resonance frequency, in Hz
                  b4      Fourth formant bandwidth, in Hz
                  b5      Fifth formant bandwidth, in Hz

                          Sex, sx

                   Male and female voices have many differences, including
          head size, pharynx length, larynx mass, and speaking  habits
          such as degree of breathiness, liveliness of pitch, choice  of
          articulatory target values, and speed of articulation. Some of
          these differences are under the control of a single parameter,
          sx,  the sex of the speaker. Speakers Paul, Harry, Frank, and
          Dennis  are male (sx = 1), while speakers Betty, Rita, Ursula,
          Wendy, and  Kit are female (sx = 0). Actually, Kit the Kid can
          be male or  female because children younger than 10 years old
          have similar  voices for both sexes.
          Changing the sx parameter causes DECtalk PC to access a
          different  (male or female) table of target values for formant
          frequencies,  bandwidths, and source amplitudes. The male and
          female tables are  patterned after two individuals who were
          judged to have pleasant,  intelligible voices. DECtalk PC's
          built-in voices are only scaled  transformations of Paul and
          Betty, the two basic voices.

          You can change the sex of any of DECtalk PC's voices by making
          the  voice current and then modifying the sx parameter. For
          example,  the following command gives Paul some of the speaking
          characteristics of a woman. (The sx parameter does not change
          the  average pitch or breathiness, so a peculiar combination of
          simultaneous male and female traits results from this sx
          change.)
                  [:np :dv sx 0] Am I a man or woman?

          The sx parameter can also be specified as m or f with the
          commands [:dv sx m] or [:dv sx f].

          WARNING: If you change the sex of the voice, some phonemes may
          cause DECtalk PC's filters to overload, producing a squawk. (The
          squawk is unpleasant, but it will not damage DECtalk PC.) The
          modification of certain parameters such as f4, f5, and g1
          (explained below) can help to correct this problem.

                          Head Size, hs
          Head size is specified as a percentage of the average size for
          an adult man (if sx = 1) or an adult woman (if sx = 0). A head
          size of 100 percent is normal or average for a given sex, but
          people can differ quite a bit in this characteristic. Head size
          has a strong influence on a person's voice. Large musical
          instruments produce low notes, and humans with large heads tend
          to have low, resonant voices. For example, to make Paul sound
          like a larger man with a 15 percent longer vocal tract (and
          formant frequencies that are scaled down by a factor of about
          0.85), type the following command.

                  [:np :dv hs 115] Do I sound more like Huge Harry this way?

          Head size is one of the best variables to use if you want to
          make dramatic voice changes. For example, Paul has a head size
          of 100, while Harry's deep voice is caused in part by a head
          size change to 115, or 15 percent greater than normal.
          Decreasing head size produces a higher voice, such as in a
          child or adolescent. Extreme changes in head size, as in the
          following examples, produce speech that is somewhat difficult
          to understand.

                  [:nh :dv hs 135] Do I have a swelled head?
                  [:nk] I am about 10 years old.

                  [:nk :dv hs 65] Do I sound like a six year old?

          WARNING: Extreme changes in head size can cause overloads, as
          well as difficulties in understanding the speech. The
          modification of certain parameters such as f4, f5, and g1
          (explained below) can help to correct this problem.

                   Higher Formants, f4, f5, b4, and b5
           A male voice typically has five prominent resonant peaks in the
          spectrum (over  the range from 0 to 5 kHz), a female voice
          typically has only four  (due to a smaller head size), and a
          child has three. If fourth and fifth formant resonances exist
          for a particular voice, they are  fixed in frequency and
          bandwidth characteristics. These  characteristics are specified
          by the parameters f4, f5, b4, and  b5, in Hz. Values for each
          predefined voice are given below.

          If a higher formant does not exist, the frequency and bandwidth
          of  the speaker definition are set to special values that cause
          the  resonance to disappear. To make a resonance disappear, the
          frequency is set to 2500 Hz, and the bandwidth to 2048 Hz. This
          is  what has been done to the fourth and fifth formants for Kit
          the  Kid.
          The permitted values for f4 and f5 have fairly complicated
          restrictions. Violating these restrictions can cause overloads
          and  squawks. The restrictions are listed below for cases where
          a  higher formant exists.

                  1.      f5 must be at least 300 Hz higher than f4.
                  2.      If sx is 1 (male), f4 must be at least 3250 Hz.
                  3.      If sx is 0 (female), f4 must be at least 3700 Hz.
                  4.      If hs is not 100, the above values should be
                          multiplied by (hs / 100).

          These higher formants produce peaks in the spectrum that become
          more prominent if b4 and b5 are smaller, and if f4 and f5 are
          closer together. The limits placed on b4 and b5 should ensure
          that no problems occur. However, smaller values for bandwidths
          may produce an overload in the synthesizer. You can correct
          these overloads by increasing the bandwidths or by changing the
          gain control g1 (below).
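
          The four restrictions on f4 and f5 listed above can be checked
          before a voice is sent to the synthesizer. The Python sketch
          below is an illustration under one stated assumption: rule 4 is
          read as scaling all three thresholds (300, 3250, and 3700 Hz)
          by hs / 100. The text does not say which values are meant, and
          the function name is hypothetical.

```python
def check_higher_formants(sx: int, hs: float, f4: float, f5: float) -> list:
    """Return a list of violated restrictions (empty if none).

    sx is 1 (male) or 0 (female); hs is head size in percent;
    f4 and f5 are in Hz. Assumption: every threshold is multiplied
    by hs / 100, as one reading of rule 4.
    """
    scale = hs / 100.0
    min_f4 = (3250.0 if sx == 1 else 3700.0) * scale
    min_gap = 300.0 * scale
    problems = []
    if f4 < min_f4:
        problems.append(f"f4 must be at least {min_f4:.0f} Hz")
    if f5 - f4 < min_gap:
        problems.append(f"f5 must be at least {min_gap:.0f} Hz above f4")
    return problems


if __name__ == "__main__":
    # Illustrative values only, not taken from any predefined voice.
    print(check_higher_formants(sx=1, hs=100, f4=3300, f5=3650))
    print(check_higher_formants(sx=0, hs=100, f4=3500, f5=3600))
```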


                  Changing Voice Quality


          Six speaker-definition parameters control aspects of the output
          of  the larynx, which, in turn, control voice quality. These
          parameters are listed below.
                  br      Breathiness, in decibels (dB)
                  lx      Lax breathiness, in percent
                  sm      Smoothness, in percent
                  ri      Richness, in percent
                  nf      Number of fixed samples of open glottis
                  la      Laryngealization, in percent

                   Breathiness, br
          Some voices can be characterized as breathy. In a breathy
          voice, the vocal folds vibrate to generate voicing and breath
          noise simultaneously. Breathiness is a characteristic of many
          female voices, but it is also common under certain
          circumstances for male voices.

          The range of the br parameter is from 0 dB (no breathiness) to
          70 dB (strong breathiness). By experimenting, you can learn
          what intermediate values sound like. For example, to turn
          Paul into a breathy, whispering speaker, type the following
          command.

                  [:np :dv br 55 gv 56] Do I sound more like Doctor Dennis
                  now?

          This voice is not as loud as the others due to the simultaneous
          decrease in the gain of voicing, gv, but it is intelligible and
          human sounding.

                    Lax Breathiness, lx

          The br parameter creates simultaneous breathiness whenever
          voicing is turned on. Another type of breathiness occurs only
          at the ends of sentences and when going from voiced to
          voiceless sounds. This type of "lax" breathiness is controlled
          by the lx parameter, in percent.

          A non-breathy, tense voice would have lx set to 0, while a
          maximally breathy, lax voice would have lx set to 100. The
          difference between these two voices is not great, but you can
          hear it if you listen closely.

                   Smoothness, sm
          Smoothness refers to the manner in which the vocal folds
          vibrate. In a smooth voice, the vocal folds meet at the
          midline, as they do in normal voicing, but they do not slam
          together forcefully to create a very sudden cessation of
          airflow.

          DECtalk PC uses a variable-cutoff, gradual low-pass filter to
          model changes to smoothness. The range of sm is from 0 percent
          (least smooth and most brilliant) to 100 percent (most smooth
          and least brilliant). The voicing source spectrum is tilted so
          that energy at higher frequencies is attenuated by as much as
          30 dB when smoothness is set to a maximum, but is not
          attenuated at all when smoothness is set to 0.

          Professional singing voices that are trained to sing above an
          orchestra are usually brilliant, while anyone who talks softly
          becomes breathy and smooth. To synthesize a breathy voice, an sm
          value of about 50 or more is good. Changes to sm do not have a
          great effect on perceived voice quality.

                   Richness, ri
          Richness is similar to smoothness and brilliance, except that
          the spectral change occurs at lower frequencies, and is due to
          a different physiological mechanism. Brilliant, rich voices
          carry well and are more intelligible in noisy environments,
          while smooth, soft voices sound more friendly. For example,
          typing the following command produces a soft, smooth version of
          Paul's voice.

                  [:np :dv ri 0 sm 70] Do I sound more mellow?

          The following command produces a maximally rich and brilliant
          (forceful) voice.

                  [:np :dv ri 90 sm 0] Do I sound more forceful?

          Smoothness and richness are usually negatively correlated when a
          speaker dynamically changes laryngeal output. The sm and ri
          parameters do not influence the speaker's identity very much.

                   Nopen Fixed, nf
          The number of samples in the open part of the glottal cycle is
          determined not only by ri, but also by a second parameter, nf.
          nf is the number of fixed samples in the  open portion of the
          glottal cycle.

          Most speakers adjust the open phase to be a certain fraction of
          the period, and this fraction is determined by ri. Other
          speakers keep the open phase fixed in duration when the overall
          period varies. To simulate this behavior, set ri to 100 and
          adjust nf to the desired duration of the open phase. The
          shortest possible open phase is 10 samples (1 ms), and the
          longest is three quarters of the period duration (about 70
          samples for a male voice).
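
          The limits on nf can be worked out as shown in the Python
          sketch below. The names are hypothetical. The figure of 10
          samples per millisecond is inferred from the statement that an
          open phase of 10 corresponds to 1 ms, and the 107 Hz pitch in
          the usage example is an assumed value chosen for illustration.

```python
SAMPLES_PER_MS = 10    # inferred from "10 (1 ms)" in the text
MIN_OPEN_SAMPLES = 10  # shortest possible open phase


def open_phase_ms(nf: int) -> float:
    """Convert an nf value in samples to a duration in milliseconds."""
    return nf / SAMPLES_PER_MS


def max_open_samples(f0_hz: float) -> int:
    """Longest open phase: three quarters of one pitch period."""
    period_samples = SAMPLES_PER_MS * 1000.0 / f0_hz
    return int(0.75 * period_samples)


if __name__ == "__main__":
    print(open_phase_ms(MIN_OPEN_SAMPLES))  # shortest open phase, in ms
    print(max_open_samples(107.0))          # assumed male pitch of 107 Hz
```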

                   Laryngealization, la
          Many speakers turn voicing on and off irregularly at the
          beginnings and ends of sentences, which  gives a querulous tone
          to the voice. This departure from perfect  periodicity is called
          laryngealization or creaky voice quality.

          The la parameter controls the amount of laryngealization in the
          voice. A value of 0 results in no laryngealized irregularity,
          and a value of 100 (the maximum) produces laryngealization at
          all times. For example, to make Betty moderately laryngealized,
          type the following command.

                  [:nb :dv la 20]

          The la parameter creates a noticeable difference in the voice,
          although it is not altogether a pleasant change.


                  Changing the Pitch and Intonation of the Voice


          Seven speaker-definition parameters control aspects of the
          fundamental frequency (f0) contour of the voice. These
          parameters are listed below and are described in the chapter on
          modifying voices.

                  bf      Baseline fall, in Hz
                  hr      Hat rise, in Hz
                  sr      Stress rise, in Hz
                  as      Assertiveness, in percent
                  qu      Quickness, in percent
                  ap      Average pitch, in Hz
                  pr      Pitch range, in percent

                   Baseline Fall, bf
          The bf parameter (baseline fall in Hz) determines one aspect of
          the dynamic fundamental frequency contour for a sentence. If bf
          is 0, the reference baseline  fundamental frequency of a
          sentence begins at 115 Hz and ends at  this frequency. All
          rule-governed dynamic swings in f0 are  computed with respect to
          the reference baseline.

          Some speakers begin a sentence at a higher f0, and gradually
          fall  as the sentence progresses. This "falling baseline"
          behavior can  be simulated by setting bf to the desired fall in
          Hz. For example,  setting bf to 20 Hz will cause the f0 pattern
          for a sentence to  begin at 125 Hz (115 Hz plus half of bf), and
          fall at a rate of 16  Hz per second until it reaches 105 Hz (115
          Hz minus half of bf).  The baseline remains at this lower value
          until it is reset  automatically before the beginning of the
          next full sentence  (right after a period, question mark, or
          exclamation point). The  rate of fall, 16 Hz per second, is
          fixed, no matter what the  extent of the fall.
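
          The baseline behavior described above can be written as a
          short calculation. The Python sketch below uses hypothetical
          names and the 115 Hz reference and 16 Hz per second fall rate
          given in the text. It models only the baseline within one
          sentence, not the hat or stress rises added on top of it.

```python
REFERENCE_F0_HZ = 115.0    # reference baseline given in the text
FALL_RATE_HZ_PER_S = 16.0  # fixed rate of fall given in the text


def baseline_f0(bf: float, t: float) -> float:
    """Baseline f0 in Hz, t seconds after the start of a sentence.

    The baseline starts at the reference plus half of bf and falls at
    the fixed rate until it reaches the reference minus half of bf.
    """
    start = REFERENCE_F0_HZ + bf / 2.0
    end = REFERENCE_F0_HZ - bf / 2.0
    return max(end, start - FALL_RATE_HZ_PER_S * t)


if __name__ == "__main__":
    for seconds in (0.0, 0.5, 1.25, 3.0):
        print(seconds, baseline_f0(20.0, seconds))
```

          With bf set to 20, the baseline reaches its lower value after
          20 / 16 = 1.25 seconds and stays there until the next sentence.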
          Whenever you include a [+] phoneme in the text to indicate the
          beginning of a paragraph, the baseline is automatically set to
          begin slightly higher for the first sentence of the paragraph.
          The subsequent sentences of the paragraph all have an
          identical, normal baseline fall.

          While baseline fall differs among the speakers, it is not a very
          good cue for differentiating between speakers. As long as the
          fall is not excessive, its presence or absence is not
          particularly noticeable.

                   Hat Rise, hr

          The hr (nominal hat rise in Hz) and sr (nominal stress impulse
          rise in Hz) parameters determine aspects of the dynamic
          fundamental frequency contour for a sentence. To modify these
          values selectively, you should understand how the f0 contour is
          computed as a function of the lexical stress pattern and
          syntactic structure of the sentence.

          A sentence is first analyzed and broken into clauses, using
          punctuation and clause-introducing words to determine the
          locations of clause boundaries. Within each clause, the f0
          contour rises on the first stressed syllable, stays at a high
          level for the remainder of the clause up to the last stressed
          syllable, and falls dramatically on the last stressed syllable.
          This rise-at-the-beginning and fall-at-the-end pattern has been
          called the "hat pattern" by linguists, using the analogy of
          jumping from the brim of a hat to the top of the hat and back
          down again.

          The hat rise parameter, hr, indicates the nominal height in
          hertz of a pitch rise to a plateau on the first stress of a
          phrase. A corresponding pitch fall is placed by rule on the
          last stress of the phrase. Some speakers use relatively large
          hat rises and falls, while others use a local "impulse-like"
          rise and fall on each stressed syllable. The default hr value
          for Paul is 22 Hz, indicating that the f0 contour rises a
          nominal 22 Hz when going from the brim to the top of the hat. To
          simulate a speaker who does not use hat rises and falls, enter
          the command [:dv hr 0].

          Other aspects of the hat pattern are important for natural
          intonation but are not accessible by speaker-definition
          commands. For example, the hat fall becomes a weaker fall
          followed by a slight continuation rise if the clause is to be
          succeeded by more clauses in the same sentence. Also, if
          unstressed syllables follow the last stressed syllable in a
          clause, part of the hat fall occurs on the very last
          (unstressed) syllable of the clause. If the clause is long,
          DECtalk PC may break it into two hat patterns by finding the
          boundary between the noun phrase and verb phrase.

          If DECtalk PC is in phoneme input mode and you use the pitch
          rise [/] and pitch fall [\] symbols, the hr parameter
          determines the actual rise and fall in Hz.

                   Stress Rise, sr

          The sr parameter indicates the nominal height, in Hz, of a local
          pitch rise and fall on each stressed syllable. This rise-fall
          is added to any hat rise or fall that may also be present. For
          example, Paul has sr set to 32 Hz, resulting in an f0 rise-fall
          gesture of 32 Hz over a span of about 150 ms, which is located
          on the first and succeeding stressed syllables. However,
          DECtalk PC rules reduce the actual height of successive stress
          rises and falls in each clause, and cause the last stress pulse
          to occur early so that there is time for the hat fall during the
          vowel.

          If the sr parameter is set too low, the speech sounds monotone
          within long phrases. Large changes to hr and sr from their
          default values for each speaker are not necessary or desirable,
          except in unusual circumstances.

                  Assertiveness, as

          Assertive voices have a dramatic fall in pitch at the end of
          utterances. Neutral or meek speakers often end a sentence with
          a slight "questioning" rise in pitch to deflect any challenges
          to their assertions. The as parameter, in percent, indicates
          the degree to which the voice tends to end statements with a
          conclusive final fall. A value of 100 is very assertive, while
          a value of 0 is maximally meek.

                   Quickness, qu

          The qu parameter, in percent, controls the speed of response to
          a request to change the pitch. All hat rises, hat falls, and
          stress rises can be thought of as suddenly applied commands to
          change the pitch, but the larynx is sluggish, and responds only
          gradually to each command. A smaller larynx typically responds
          more quickly, so while Harry has a quickness value of 10, Kit
          has a value of 50.

          In engineering terms, a value of 10 implies a time constant
          (time to get to 70 percent of a suddenly applied step target)
          of about 100 ms. A value of 90 corresponds to a time constant
          of about 50 ms. Lower quickness values may mean that the f0
          never quite reaches the target value before a new command comes
          along to change the target, but this is perfectly natural.

                   Average Pitch, ap, and Pitch Range, pr
          The ap (average pitch in Hz) and pr (pitch range in percent of
          normal range) parameters modify the computed values of
          fundamental frequency, f0, according to the formula:

                  f0' = ap + (((f0 - 120) * pr) / 100)

          If ap is set to 120 Hz and pr to 100 percent, there will be no
          change to the "normal" f0 contour that is computed for a typical
          male voice. The effect of a change in ap is simply to raise or
          lower the entire pitch contour by a constant number of Hz,
          while the effect of pr is to expand or contract the swings in
          pitch about 120 Hz.

          Normally, a smaller larynx simultaneously produces f0 values
          that are higher in average pitch and higher in pitch range by
          about the same factor (the whole f0 contour is multiplied by a
          constant factor). Observing the values assigned to ap and pr
          for each of the voices (Appendix D), you can see that the
          voices rank in average pitch from low (Harry) to high (Kit).
          Rankings for pr are similar, except that Frank has a flat,
          non-expressive pitch range compared with his average pitch.

          The best way to determine a good pitch range for a new voice is
          by trial and error. You can create a monotone or robot-like
          voice by setting the pitch range to 0. For example, to make
          Harry speak in a monotone at exactly 90 Hz, type the following
          command.

                   [:nh :dv ap 90 pr 0] I am a robot.

          Reducing the pitch range reduces the dynamics of the voice,
          suggesting emotions such as sadness. Increasing the pitch range
          while leaving the average pitch the same or setting it slightly
          higher suggests excitement.

          Due to constraints involved in pitch-synchronous updating of
          other dynamically changing parameters, the fundamental
          frequency contour that is computed by the above formula is then
          checked for values that are out of bounds with respect to the
          following limits.

                  f0 maximum = 500 Hz
                  f0 minimum = 50 Hz

          Any value outside this range is limited to fall within the
          range.
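
          The pitch formula and the limits above can be combined in a
          short Python sketch. The function and constant names are
          hypothetical. The calculation follows the formula given in this
          section and then limits the result to the range 50 to 500 Hz.

```python
F0_MIN_HZ = 50.0   # lower limit given in the text
F0_MAX_HZ = 500.0  # upper limit given in the text


def adjusted_f0(f0: float, ap: float, pr: float) -> float:
    """Apply f0' = ap + (((f0 - 120) * pr) / 100), then limit the result.

    f0 is the computed fundamental frequency in Hz, ap is the average
    pitch in Hz, and pr is the pitch range in percent.
    """
    scaled = ap + ((f0 - 120.0) * pr) / 100.0
    return max(F0_MIN_HZ, min(F0_MAX_HZ, scaled))


if __name__ == "__main__":
    print(adjusted_f0(140.0, ap=120.0, pr=100.0))  # contour unchanged
    print(adjusted_f0(140.0, ap=90.0, pr=0.0))     # monotone at ap
```

          With pr set to 0 the result is always ap, which is why the
          robot example above speaks at a constant 90 Hz.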
          To keep you from exceeding reasonable limits on the parameters
          controlling pitch, constraints have been placed  on values
          selected. If a [:dv _] command requests values outside  these
          limits, the request is limited to the nearest listed value
          before execution.

                  Changing Relative Gains and Avoiding Overloads

          Nine speaker-definition parameters control the output levels of
          various internal sound sources and resonators. These parameters
          are listed below.

                  gv      Gain of voicing source, in dB
                  gh      Gain of aspiration source, in dB
                  gf      Gain of frication source, in dB
                  gn      Gain of nasalization, in dB
                  g1      Gain of cascade formant resonator 1, in dB
                  g2      Gain of cascade formant resonator 2, in dB
                  g3      Gain of cascade formant resonator 3, in dB
                  g4      Gain of cascade formant resonator 4, in dB
                  g5      Loudness of the voice, in dB

                  Loudness, g5

          Each predefined voice has been adjusted to have about the same
          perceived loudness, a value that is about optimum for telephone
          conversation. The value chosen is near maximum (if loudness
          were increased much, some phonemes would probably cause an
          overload squawk). A near-maximum value was selected to maximize
          the signal-to-noise ratio of DECtalk PC.

          If you want to decrease the loudness of a voice, or make a
          temporary increase for a phrase that is known not to overload,
          determine the g5 value in dB for the voice in question. Then
          adjust the voice by using a command such as the following.

                  [:np :dv g5 76] I am speaking at about half my normal
                  level.

          Because the g5 entry for Paul is 86, this command reduces
          loudness by 10 dB. Perceived loudness approximately doubles (or
          halves) for each 10 dB increment (decrement) in g5.
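
          The rule of thumb above, a doubling or halving of perceived
          loudness for each 10 dB, can be expressed as a one-line
          calculation. The Python sketch below uses a hypothetical
          function name and gives only the approximation stated in the
          text, not a measured loudness.

```python
def relative_loudness(g5: float, reference_g5: float) -> float:
    """Approximate perceived loudness relative to a reference g5 value.

    Uses the rule of thumb that perceived loudness doubles (or halves)
    for each 10 dB increment (or decrement).
    """
    return 2.0 ** ((g5 - reference_g5) / 10.0)


if __name__ == "__main__":
    # Paul's g5 entry is 86, so a setting of 76 is 10 dB lower.
    print(relative_loudness(76.0, 86.0))
```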

          Software control over loudness is useful in a loudspeaker
          application where the background noise level in the room might
          change. For example, a vocally handicapped wheelchair-bound
          person does not want to appear to be shouting in a quiet
          interpersonal conversation, but may wish to be able to converse
          in a noisy room as well. Using a software abbreviation
          facility, such a person could type lo to select a command
          making the voice maximally loud, and sof to invoke a command
          setting g5 to a reduced value.

          Note: DECtalk PC comes with both software and hardware volume
          control, so modification of the g5 parameter should not be
          necessary. Using the [:volume ...] command or the volume
          control knob on the external loudspeaker is recommended.

                  Sound Source Gains, gv, gh, gf, and gn

          Several types of sound sources are activated during speech
          production: voicing, aspiration, frication, and nasalization.
          The relative output levels of these sounds, in dB, are
          determined by the gv, gh, gf, and gn parameters, respectively.
          The default settings for these parameters have been factory
          pre-set to maximize the intelligibility of each voice. However,
          changing the settings can be useful in debugging the system or
          in demonstrating aspects of the acoustic theory of speech
          production. You could change the level of one sound source
          globally; for example, turn off frication to be able to hear
          just the output of the larynx. These parameters might have to
          be reduced to overcome certain kinds of overloads, but try the
          procedure in the next section first.
















                  Cascade Vocal Tract Gains, g1, g2, g3, and g4

          Changes in head size or other parameters can sometimes produce
          overloads in the synthesizer circuits. If this occurs, first
          check to see that f4 and f5 are set to reasonable values. If
          the squawk remains, you can adjust several gain controls, g1
          through g4 (in dB), in the cascade of formant resonators of the
          synthesizer to attenuate the signal at critical points. The
          signal can then be amplified back to the desired output level
          later in the synthesis.

          Use the following procedure to correct an overload (typically
          indicated by a squawk during part of a word).

          1.      Synthesize the word or phrase several times to make sure
                  the squawk occurs consistently. Use the same test word
                  each time you make one of the following changes to a
                  gain.

          2.      Determine the default values for g1 through g4 for the
                  speaker that overloads.

          3.      Reduce g1 in steps of 3 dB until the squawk goes away,
                  and note the reduction that was needed. If more than a
                  10 dB decrement is required, some other parameter has
                  probably been changed too much. If the squawk does not
                  go away at all, you may need to reduce gv instead of g1.

          4.      Add this decrement to g5 to return the output to its
                  original level. For example, if g1 was reduced by 6 dB,
                  add 6 dB to g5 (or to g4 if g5 is already at a maximum).
                  If incrementing g5 causes the squawk to return, decrease
                  g5 slowly until the squawk goes away.

          This procedure works in most cases, but using g2 rather than g1
          can work better. If you can return g1 to its factory pre-set
          value and reduce g2 instead to make the squawk go away, then
          the signal-to-quantization-noise level in g1 remains maximized.
          If  you can fix the squawk by using g3 or g4 rather than g2,
          more of  the cascaded resonator system can be made immune to
          quantization  noise accumulation.
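
          The overload procedure above can be outlined in code. The
          following Python sketch is illustrative only: squawks stands for
          a hypothetical check (in practice, a person listening to the test
          word), the gain values in the usage note are made up, and the
          alternatives of reducing gv or adding to g4 are not modeled.

```python
from typing import Callable, Dict, Optional

Gains = Dict[str, int]


def correct_overload(defaults: Gains,
                     squawks: Callable[[Gains], bool],
                     step_db: int = 3,
                     max_cut_db: int = 10) -> Optional[Gains]:
    """Return adjusted gains, or None if g1 alone cannot cure the squawk."""
    gains = dict(defaults)
    cut = 0
    # Step 3: lower g1 in 3 dB steps until the squawk goes away.
    while squawks(gains):
        if cut + step_db > max_cut_db:
            # More than 10 dB would be needed: some other parameter has
            # probably been changed too much (or gv should be reduced).
            return None
        cut += step_db
        gains["g1"] = defaults["g1"] - cut
    # Step 4: add the same amount to g5 to restore the output level.
    gains["g5"] = defaults["g5"] + cut
    # If the squawk returns, back g5 off slowly until it goes away.
    while squawks(gains) and gains["g5"] > defaults["g5"]:
        gains["g5"] -= 1
    return gains
```

          For example, with made-up defaults of g1 = 68 and g5 = 86 and a
          check that reports a squawk whenever g1 is above 62 or g5 is
          above 90, the sketch settles on g1 = 62 and g5 = 90.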

                  The [save ] Parameter and [:nv] Voice


          You can save a modified speaker definition in a buffer while
          synthesizing speech with one of the other voices. The Variable
          Val  voice [:nv] is either male or female, depending on what
          values are stored in the buffer. If you call Val before storing
          any values in  the buffer, DECtalk PC uses the Perfect Paul
          voice [:np]. The  following commands store a modified Betty
          voice in Val and then  recall it.
                  [:nb :dv sex m save ]
                  (Store the modified Betty voice in Val.)

                  [:np] I am Paul.











                  (Use another voice.)

                  [:nv] I am Val.
                  (Recall the Val [modified-Betty] voice.)
          The buffer holds its contents until you power down DECtalk PC.
          You must re-enter new voice characteristics if you turn off
          DECtalk PC.

          Note: If you wish to use the save command, leave a space
          between the command and the trailing bracket, e.g., [:dv save ]

                  Summary on Speaker-Definition Parameters


          Of the 27 parameters, only a few cause dramatic changes in the
          voice. The greatest effects are obtained with changes to hs, ap,
          pr, and sx, while moderate changes occur when modifying la and
          br.  To some extent, DECtalk PC's nine factory-set speakers
          cover most of  the possible voices, so don't expect to be able
          to find a voice  that is highly novel and intelligible. However,
          you might easily  find ways to slightly improve one of the
          standard voices.
                  VOICE COMMAND SYNTAX


          DECtalk PC uses the following voice command syntax rules.
                  1.      Begin every command with a colon (:).

                  2.      Separate each command and its parameter(s) from
                          the text by a valid word boundary marker such as
                          a space, tab, or carriage return.

                  3.      You can include several commands in the same
                          square bracket set.

                          [:ra 150 :nb] Hello. How are you?

                  4.      You can include several parameters in the same
                          square bracket set if the command allows more
                          than one parameter. If you use several
                          parameters, you must give them all before a
                          second command in the same square bracket set.

                          [:dv ap 160 pr 50 save :nv] Hi there.
                          (The parameter group modifies the [:dv _]
                          command.)

                          [:dv ap 160 save :nv pr 50] Hi there.
                          (Wrong. The parameter group is out of place.)

                  5.      If you give two conflicting parameters or
                          commands, DECtalk PC will use the last command
                          in the sequence. For example, if you type

                          [:nb :np] Hello.












                          DECtalk PC will use Paul's voice.

                  6.      You can use phonemic symbols in the same square
                          brackets with voice commands.

                          Now I'm [:dv ap 90 pr 130 r"iyliy] thrilled!

                  7.      If the value in a [:dv _] command is too low,
                          DECtalk PC will use the minimum valid value. If
                          the value is too high, the maximum valid value
                          will be used.

                  8.      Once you give a command, that command applies to
                          all further text until overridden by another
                          command. For example, the command [:nk] will make
                          DECtalk PC use Kit's voice on all entered text
                          until you enter another new voice command.

                  9.      All [:dv _] commands are lost when you power
                          down DECtalk PC.

                  10.     Invalid commands are ignored. By setting the
                          [:error ...] command, you can receive an audible
                          warning that an invalid command has been entered.
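
          Rules 1, 3, and 4 above lend themselves to a small helper in the
          controlling program. The following Python sketch is one possible
          approach, assuming commands are supplied as (name, parameters)
          pairs; build_command is a hypothetical name, and the sketch does
          not validate command names or parameter ranges.

```python
from typing import Iterable, List, Sequence, Tuple

Command = Tuple[str, Sequence[object]]


def build_command(commands: Iterable[Command]) -> str:
    """Join several commands into one square bracket set.

    Each command starts with a colon (rule 1), and all of its parameters
    are written before the next command begins (rule 4).
    """
    parts: List[str] = []
    for name, params in commands:
        words = [":" + name] + [str(p) for p in params]
        parts.append(" ".join(words))
    return "[" + " ".join(parts) + "]"


rate_and_voice = build_command([("ra", [150]), ("nb", [])])
modified_voice = build_command(
    [("dv", ["ap", 160, "pr", 50, "save"]), ("nv", [])]
)
```

          The two results reproduce the correct examples given under rules
          3 and 4. Because parameters are always attached to their own
          command, the out-of-place ordering shown as wrong under rule 4
          cannot be produced.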





          TEXT TUNING EXAMPLE

          The following is an example of how to tune text. Speech
          synthesis technology allows for more natural text-to-speech with
          each passing year. However, there are still areas in the speech
          that can be "tuned up" a bit for greater naturalness. Much of
          this involves the strategic placing of commas and periods, which
          essentially tell DECtalk to pause as a native speaker of English
          would when speaking the same text. This is necessary because
          spoken language and written text differ: written text often does
          not contain information about pausing.

          The text below is presented twice, the first time as
          originally written, and the second time after phonemic and
          textual fixes were applied.

          Original Version
          [:np]

               A California Shaggy Bear Tale for Seven DECtalk Voices

                                   by Dennis Klatt

          [:np] Once upon a time, there were three bears. They lived in
          the  great forest, and tried to adjust to modern times
          [:nh] I'm papa bear. I love my family but I love honey best.











          [:nb] I'm mama bear. Being a mama bear is a drag.

          [:nk] I'm baby bear and I have trouble relating to all of the
          demands of older bears.
          [:np] One day, the three bears left their condominium to search
          for honey. While they were gone, a beautiful young lady snuck
          into  the bedroom through an open window.

          [:nw] My name is Whispering Wendy. My purpose in entering this
          building should be clear. I am planning to steal the family
          jewels.
          [:np] Hot on her trail was the famous police detective, Frail
          Frank.

          [:nf] Have you seen a lady carrying a laundry bag over her
          shoulder?
          [:np] A woman kneeling with her left ear firmly placed against a
          large rock responded.

          [:nu] No. No one passed this way. I've been listening for
          earthquakes all morning, but have only spotted three bears
          searching for honey.
          Changed Version

          [:np]
          Add periods after the title and author.

          A California Shaggy Bear Tale for Seven DECtalk Voices.
          By Dennis Klatt.

          Make phonemic corrections.
          [:nh] This story was used to demonstrate DECtalk at ['aykaesp]
          84,  in May of 1984, at San Diego California.

          [:np] Once upon a time, there were three bears. They lived in
          the  great forest and tried to adjust to modern times.
          Add commas and emphatic stress.

          [:nh] I'm papa bear. I love my family, but I love ["]honey best.
          [:nb] I'm mama bear. Being a mama bear is a drag.

          [:nk] I'm baby bear and I have trouble relating to all of the
          demands of older bears.
          Begin a verb phrase.

          [:np] One day, the three bears [)] left their condominium to
          search for honey. While they were gone, a beautiful young lady
          snuck into the bedroom through an open window.
          [:nw] My name is Whispering Wendy. My purpose in entering this
          building should be clear. I am planning to steal the family
          jewels.

          Begin a new paragraph.











          [:np] [+] Hot on her trail was the famous police detective,
          Frail  Frank.

          [:nf] Have you seen a lady carrying a laundry bag over her
          shoulder?
          Add commas for phrasing.

          [:np] A woman, kneeling with her left ear firmly placed against
          a  large rock, responded.
          Add pitch control and emphatic stress.

          [:nu] ["]No. No [/]one passed this [/\]way. I've been listening
          for ["]earthquakes all morning, but have only spotted three
          bears searching for honey.




















































                                              CHAPTER 7


                                    DEVELOPING AN

                         ADVANCED SPEECH APPLICATION




           The development process described in this manual assumes that
          your  application has full control over the text being spoken.
          If you  are developing an application that must read arbitrary
          text (such  as electronic mail messages), your task is more
          difficult because almost anything can appear in the text.
          DECtalk PC is controlled by your PC. Even  the smallest personal
          computer has enough power to pre-process  (filter) text to
          handle application-unique cases. So, you can put
          application-specific text filters in the controlling computer,
          rather than add many additional special cases (and switches to
          enable and disable them) to DECtalk.

          For an electronic mail system, you can program an electronic
          mail  pre-processor to make the following text conversions
          before sending  the text to DECtalk.
          1.      Parse the header boilerplate to remove extraneous
                  information.

          2.      If DECtalk is speaking paragraphs of text, add the [+]
                  symbol to the blank line separating each paragraph.

          3.      If words are separated by / or other special characters
                  or punctuation, and DECtalk pronounces them when you do
                  not want them pronounced, check to see if punctuation is
                  turned off. If it is not, turn punctuation off with the
                  [:punc none] command. You may also replace such
                  characters or punctuation with a space (for example,
                  "Raleigh/Durham" could become "Raleigh Durham" so that
                  DECtalk says it without spelling out the entire string).

          4.      Create your own application-specific dictionary for
                  words, such as proper names, that DECtalk mispronounces.
                  If DECtalk is connected to a data base containing names,
                  consider adding a pronunciation field to the name
                  record, entering phonemic text when appropriate.

                  Note: DECtalk PC is now able to handle many proper names
                  and addresses quite well using the [:pronounce name] or
                  [:mode name ...] commands.

          5.      Scan the text for strings of numbers in a format
                  understandable to your application but not to DECtalk.
                  For example, if you can extract the time format from an
                  electronic mail message, you can add code to your
                  application to expand it to its "o'clock" form.

          6.      In many applications, the listener will want to write
                  down number strings (such as prices or telephone
                  numbers). Your application can scan the text for strings
                  of numbers and, when found, send them to DECtalk in a
                  way that includes pauses at critical locations. For
                  example:

                  The number is, 1 (800) 5 5 5, 1 2 3 4. [:ra 120]
                  That is, [_<300>] 1 (800), [_<500>] 5 5 5, [_<900>] 1 2 3 4. [:ra 180].

                  The spaces between the numbers ensure that "five five
                  five" is spoken rather than "five hundred and fifty
                  five." (You may also use the [:mode spell on] command.)
                  The slower speaking rate and the silence phonemes of
                  specified durations were carefully selected to permit
                  enough time for the listener to write down the entire
                  number. Silence phonemes were positioned after an
                  orthographic comma to maintain appropriate intonation.

                  As another example, if your application speaks money
                  (such as bank balances or item costs), it might say

                  Your balance is $244.05
                  That is, 2 4 4, [_<400>] point 0 5, [_<400>] dollars.



           7.     When spelling an item out, your application may have to
          distinguish the case of letters. Consider using different
          voices to distinguish between uppercase and lowercase  letters
          (for example, Harry and Paul). Some screen-reading software
          provides this functionality.
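
          Several of the conversions listed above (steps 2, 3, and 6) are
          simple text substitutions. The following Python sketch shows one
          way a pre-processor might perform them; the function names are
          hypothetical, the regular expressions are assumptions about what
          counts as a paragraph break or a word, and the telephone number
          is assumed to arrive already split into its three groups.

```python
import re


def mark_paragraphs(text: str) -> str:
    """Put the [+] symbol on the blank line(s) separating paragraphs."""
    return re.sub(r"\n(?:[ \t]*\n)+", "\n[+]\n", text)


def replace_slashes(text: str) -> str:
    """Replace a slash between two word characters with a space."""
    return re.sub(r"(?<=\w)/(?=\w)", " ", text)


def speak_phone_number(area: str, exchange: str, line: str) -> str:
    """Repeat a telephone number slowly, with pauses between groups.

    The rate values and silence durations are the ones used in the
    example in the text; digits are separated by spaces so that each
    one is spoken individually.
    """
    return (
        "[:ra 120] That is, "
        f"[_<300>] 1 ({area}), "
        f"[_<500>] {' '.join(exchange)}, "
        f"[_<900>] {' '.join(line)}. [:ra 180]"
    )


city = replace_slashes("Raleigh/Durham")
slow_number = speak_phone_number("800", "555", "1234")
```

          A real pre-processor would also need to find the numbers in the
          text and to handle the mail header (step 1), which depends on
          the mail system in use.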


                  OPTIMIZING THE QUALITY OF SPOKEN TEXT

          In some applications, it is important to get a few sentences to
          sound very good because they are used often.  Usually DECtalk
          does an excellent job, but the phrasing can occasionally be
          improved.  In these cases, you may wish to  improve the quality
          of a particular sentence. The following steps  are suggested.

          1.      Send the sentence to DECtalk and listen repeatedly,
                  focusing on each word in turn to detect any
                  mispronunciations.

          2.      For each word that is mispronounced, there are several
                  alternatives for getting the correct pronunciation.

          For words that have two alternate pronunciations (see Appendix
          C for a complete list), DECtalk typically picks the more common
          of the two. If the other pronunciation is desired, simply enter
          it in phonemic text. DECtalk soon will be able to choose the
          correct pronunciation by itself. For example, if you type the
          following sentences:

                  He produced a lot of refuse.
                  He refused the produce.

                  He inserts 5 inserts per minute.
                  He  deliberated  deliberately for a long time.

          You can hear that some of these alternately pronounced words are
          incorrect. However, even when DECtalk is able to handle most of
          these automatically, no procedure is infallible, and there are
          times when you may need to assist the correct pronunciation.
          You can correct such mistakes in three ways:

                  a.      Replace the correct spelling of the word with a
                          clever misspelling.

                          I red yesterday that . . .

                  b.      Phonemicize the text.

                          I [r'ehd] yesterday that . . .

                  c.      Use the slash notation.

                          I /read yesterday that . . .

          3.      If the word is a compound, use a hyphenated spelling to
                  help DECtalk see the two parts of the compound.

                  The slide-show host . . .

          4.      Replace the text version by a phonemic string. Use the
                  commands and phonemic symbols described above. Be sure
                  to place the lexical stress pattern correctly.

          5.      Sometimes, a word does not sound quite right even when
                  the best alternate phonemic representation is selected.
                  Usually, such subtle pronunciation defects are not
                  correctable.

          6.      Now that each word has been pronounced in the best
                  possible way, listen to the total sentence rhythm and
                  accent pattern. If it is not right, try each of the
                  following steps.

          7.      If it sounds like there should be a short pause in a
                  particular sentence location, but DECtalk says the
                  sentence without a pause, try inserting a comma between
                  the words in question.

          8.      If the wrong word is emphasized in the sentence, try to
                  point out the word that should receive most emphasis by
                  placing a phonemic emphasis symbol before it.

                  The ["]younger man is the trouble-maker, not the older
                  one.

          9.      Use the pitch control symbols [/], [\], and [/\] to make
                  final adjustments.

          10.     If none of these steps gives you a satisfactory
                  sentence, you can still specify durations and
                  fundamental frequency motions for all phonemes with the
                  commands described above. To avoid too much
                  trial-and-error effort, you should have access to a
                  speech analysis facility to analyze a recording of the
                  way the sentence should sound.
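
          When the same words need phonemic replacement repeatedly, the
          substitution described in step 4 can be automated in the
          controlling program. The following Python sketch assumes a
          hypothetical lexicon that maps a spelling to a bracketed
          phonemic string. The phonemic string in the usage example is the
          one used in the text tuning example of this manual; the spelling
          ICASSP is an assumption about the word it stands for.

```python
import re
from typing import Dict


def apply_lexicon(text: str, lexicon: Dict[str, str]) -> str:
    """Replace each whole word found in the lexicon by its phonemic form.

    Matching is case-sensitive and works on runs of letters only, so the
    text should not already contain bracketed commands whose letters
    happen to match a lexicon entry.
    """
    def substitute(match: "re.Match[str]") -> str:
        word = match.group(0)
        return lexicon.get(word, word)

    return re.sub(r"[A-Za-z]+", substitute, text)


lexicon = {"ICASSP": "['aykaesp]"}
tuned = apply_lexicon("Demonstrated at ICASSP 84.", lexicon)
```

          Words that are not in the lexicon pass through unchanged, so the
          lexicon only needs entries for words that DECtalk is known to
          mispronounce in the application.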



                                  COMMON ERRORS


           When using DECtalk, try to avoid making these two common, major
          errors.
          1.      Forgetting to change back to default voice


          If you forget to return DECtalk to the standard Paul voice after
          using one of the other voices, all future text will use the
          voice currently selected. It is a good programming habit to
          return to  Paul's voice (or the default voice) after every text
          message.
          2.      Accidental entry into phonemic mode


          If [:phoneme arpabet speak] is on, permitting phonemic input, it
          is possible to get into phonemic mode unintentionally. If the
          text contains an unexpected [, or if you forget to type ] after
          a phonemic entry, DECtalk is left in a state where it tries to
          interpret text phonemically. This error makes DECtalk garble
          speech. In fact, DECtalk is simply doing the best it can to
          interpret text phonemically, discarding phonemically illegal
          letters. This problem can be avoided by placing one closing
          square (phonemic) bracket at the beginning of your text along
          with your speaking rate and voice commands, e.g.,
                  ]   [:ra 220] [:nh]

                  [+] Ladies and Gentlemen ...
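
          Both common errors can be guarded against in the program that
          sends text to DECtalk. The following Python sketch wraps each
          message with a leading closing bracket and a return to the
          default voice. The function name and its defaults are
          hypothetical, and the bracket check is a crude count rather than
          a full parse.

```python
def wrap_message(text: str, rate: int = 220, voice: str = "h",
                 default_voice: str = "p") -> str:
    """Wrap one message so it cannot leave DECtalk in a bad state.

    A leading "]" closes any phonemic mode left open by earlier text,
    and the trailing voice command returns to the default voice.
    """
    if text.count("[") != text.count("]"):
        raise ValueError("unbalanced square brackets in message text")
    return f"] [:ra {rate}] [:n{voice}]\n{text}\n[:n{default_voice}]"


message = wrap_message("[+] Ladies and Gentlemen ...")
```

          Rejecting unbalanced text is only one option; an application
          that reads arbitrary text might instead strip or replace stray
          brackets before sending it.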




















































































                                     APPENDICES


























































                          APPENDIX  A

                        Configuration for the DECtalk PC Board

          Specifications

          Interrupt Vector Assignments

                  Switch Setting                          Available Selections
                  1       2       3       4       5       IRQ

                  ON      OFF     OFF     OFF     OFF     IRQ3 - Default
                  OFF     ON      OFF     OFF     OFF     IRQ4
                  OFF     OFF     ON      OFF     OFF     IRQ5
                  OFF     OFF     OFF     ON      OFF     IRQ6
                  OFF     OFF     OFF     OFF     ON      IRQ7

          Note: In switches 1-5, only ONE switch can be ON.

          Address Assignments

                  I/O Addresses

                  6       7       Address Range
                  OFF     OFF     240 - 24F
                  ON      OFF     250 - 25F
                  OFF     ON      340 - 34F - Default
                  ON      ON      350 - 35F

                  BIOS Addresses

                  8       9       Address Range
                  OFF     OFF     C0000 - C7FFF
                  ON      OFF     C8000 - CFFFF
                  OFF     ON      D0000 - D7FFF
                  ON      ON      D8000 - DFFFF - Default

          The DIP-SWITCH pack default settings are therefore as follows:

          1 ON;  2 OFF;  3 OFF;  4 OFF;  5 OFF;  6 OFF;  7 ON;  8 ON;  9 ON
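
          The switch tables above can be captured as lookup tables, which
          is convenient for a setup utility that reports the configuration
          implied by a given switch pattern. The following Python sketch
          is illustrative only; the names are hypothetical, the address
          values are the starting addresses from the tables, and the
          program does not read the switches from the board.

```python
from typing import Dict, Tuple

# Switches 1-5 select the interrupt; exactly one may be ON.
IRQ_BY_SWITCH = {1: 3, 2: 4, 3: 5, 4: 6, 5: 7}

# Keys are (switch 6, switch 7) and (switch 8, switch 9); True means ON.
IO_BASE = {(False, False): 0x240, (True, False): 0x250,
           (False, True): 0x340, (True, True): 0x350}
BIOS_BASE = {(False, False): 0xC0000, (True, False): 0xC8000,
             (False, True): 0xD0000, (True, True): 0xD8000}


def decode_switches(switches: Dict[int, bool]) -> Tuple[int, int, int]:
    """Return (IRQ number, I/O base address, BIOS base address)."""
    on = [n for n in range(1, 6) if switches[n]]
    if len(on) != 1:
        raise ValueError("exactly one of switches 1-5 must be ON")
    irq = IRQ_BY_SWITCH[on[0]]
    io_base = IO_BASE[(switches[6], switches[7])]
    bios_base = BIOS_BASE[(switches[8], switches[9])]
    return irq, io_base, bios_base


# Factory defaults: 1 ON, 2-6 OFF, 7 ON, 8 ON, 9 ON.
DEFAULTS = {1: True, 2: False, 3: False, 4: False, 5: False,
            6: False, 7: True, 8: True, 9: True}
irq, io_base, bios_base = decode_switches(DEFAULTS)
```

          With the factory defaults, the sketch yields IRQ3, I/O base 340
          (hexadecimal), and BIOS base D8000 (hexadecimal), matching the
          entries marked Default in the tables.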
























                                                 APPENDIX  B


                                     DECtalk PC PHONEMIC SYMBOLS


          Several English phonemic alphabets are widely used today. The
          table below lists the phonemic alphabet that DECtalk uses, along
          with an example of each sound. Some dictionaries put the stress
          symbol after the vowel nucleus or at the start of the syllable.
          DECtalk requires that the stress symbol appear immediately before
          the vowel.




                  Phonemic Symbol         Example


                                  Vowels


                          ey              bake

                          aa              Bob
                          iy              beat

                          eh              bet
                          ay              bite

                          ih              bit
                          oy              boy

                          ow              boat
                          uw              lute

                          ah              but
                          aw              bout

                          yu              cute
                          rr              bird

                          ao              bought
                          ae              bat

                          uh              book
                          ix              kisses

                          ax              about













                                  Consonants

                          p               pet
                          b               bet

                          t               test
                          d               debt

                          k               Ken
                          g               guess

                          f               fin
                          v               vest

                          th              thin
                          dh              this

                          s               sit
                          z               zoo

                          sh              shin
                          zh              measure

                          ch              chin
                          jh              gin

                          m               met
                          n               net

                          nx              sing
                          w               wet

                          y               yet
                          hx              head

                          r               red
                          l               let

                          el              bottle
                          en              button

          Note: The [em] phoneme in the early version of DECtalk is no
          longer valid but can be replaced with the sequence [axm].

                                  Allophones

                          rx              oration (postvocalic r)
                          lx              electric (postvocalic l)

                          q               we eat (glottal stop)
                          dx              rider (flap d)

                          tx              Latin (glottalized t)












                                  Stress and Syntactic Symbols


                          '               primary stress

                          `               secondary stress
                          "               emphatic stress

                          -               syllable boundary
                          *               morpheme boundary

                          #               word boundary (compound nouns)
                          <SP>            word boundary

                          <TAB>           word boundary
                          <RET>           word boundary

                          (               beginning of relative clause
                          )               end of relative clause

                          ,               end of clause (same as comma)
                          .               end of normal sentence

                          ?               end of question
                          !               end of exclamation

                          _               silence (underscore symbol)



                                  TONES


          DECtalk can also be used to sing songs or make various sounds
          associated with singing and tones. The following table allows
          the developer to more easily encode a phonemic sequence to
          produce such sounds.

          Note: In DECtalk DTC01, the pitch was calibrated to a physical
          scale. The pitches are now calibrated to a musical scale. This
          puts them on the same scale as musical instruments (i.e., the A
          above middle C = 440 Hz rather than 430.4 Hz).



          Number  Note    Pitch (Hz)

          1       C2      65
          2       C#      69
          3       D       73
          4       D#      77
          5       E       82
          6       F       87
          7       F#      92
          8       G       98
          9       G#      103
          10      A       110
          11      A#      116
          12      B       123
          13      C3      130
          14      C#      138
          15      D       146
          16      D#      155
          17      E       164
          18      F       174
          19      F#      185
          20      G       196
          21      G#      207
          22      A       220
          23      A#      233
          24      B       247
          25      C4      261
          26      C#      277
          27      D       293
          28      D#      311
          29      E       329
          30      F       348
          31      F#      370
          32      G       392
          33      G#      415
          34      A       440
          35      A#      466
          36      B       494
          37      C5      523

          Voice ranges (tone numbers, inclusive)

                  Bass            5 - 29
                  Baritone        8 - 32
                  Tenor           13 - 36
                  Alto            18 - 37
                  Soprano         24 - 37

          The alto and soprano ranges run to the end of the table.













          Note: C4 is middle C
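
          The pitches in the table can be approximated by formula. The
          following Python sketch assumes that the musical scale referred
          to above is twelve-tone equal temperament anchored at tone 34
          (A = 440 Hz); the manual does not state the tuning system, so
          treat this as an approximation. The table values agree with it
          to within a hertz or two.

```python
def tone_frequency(number: int) -> float:
    """Approximate frequency in Hz for a tone number from the table.

    Assumes equal temperament: each step of one tone number multiplies
    the frequency by the twelfth root of two, with tone 34 at 440 Hz.
    """
    if not 1 <= number <= 37:
        raise ValueError("tone number must be between 1 and 37")
    return 440.0 * 2.0 ** ((number - 34) / 12.0)


middle_c = tone_frequency(25)   # tone 25 is C4, middle C
```

          An application that encodes melodies can use such a function to
          check its note table, while taking the values in the table above
          as authoritative.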























































                                         110










                                            APPENDIX  C


                                     Homographs



          Homographs are words that are spelled exactly the same but are
          pronounced differently. Often the difference lies in which
          syllable is accented. For example, if permit is a noun, the
          accent is on the first syllable (PERmit); if the word is used
          as a verb, the accent is on the second syllable (perMIT).
          Choosing the right form makes a great deal of difference in
          understanding DECtalk when it speaks such words in connected
          discourse.

          In earlier versions of DECtalk, the default form was always the
          noun. In later versions of DECtalk, the default form is the more
          frequent of the two. If the alternate pronunciation is needed,
          you may insert the correct phonetics from the list below, or
          obtain the alternate pronunciation by preceding the word with a
          slash ("/"). For example, the word sow 'to plant seed' has the
          default pronunciation; the pronunciation of sow 'female pig' is
          obtained by placing a slash immediately before the word, e.g.,
          /sow. DECtalk PC handles more homographs than any previous
          version of DECtalk.

          The table below is a new and expanded list of the common
          homographs of English with alternative pronunciations in
          phonetic transcription. In those cases where DECtalk PC does not
          choose the correct pronunciation, you can either use a slash or
          insert the correct pronunciation phonetically from the list
          below.


          SPELLING        PRIMARY           ALTERNATE
          abstract        'aebstraekt       aebstr'aekt
          abuse           axby'uz           axby'us
          addict          axd'ihkt          'aedihkt
          advocate        'aedvaxkeyt       'aedvaxkaxt
          affix           'aefihks          axf'ihks
          ally            'aelay            axl'ay
          alternate       'aoltrrnaxt       'aoltrrneyt
          animate         'aenihmeyt        'aenihmaxt
          annex           'aenehks          axn'ehks
          appropriate     axpr'owpriyaxt    axpr'owpriyeyt
          arithmetic      axr'ihthmaxtixk   aerixthm'ehtixk
          articulate      aart'ihkyeleyt    aart'ihkyelaxt
          associate       axs'owshiyeyt     axs'owshiyaxt
          attribute       axtr'ihbyuwt      'aetrixbyuwt
          august          'aogaxst          aog'ahst
          bass            b'eys             b'aes



                                         111








          baton           baxt'aon          b'aetaxn
          close           kl'owz            kl'ows
          combat          kaxmb'aet         k'aambaet
          combine         kaxmb'ayn         k'aambayn
          compact         kaxmp'aekt        k'aampaekt
          complex         k'aamplehks       kaxmpl'ehks
          compound        k'aampawnd        kaxmp'awnd
          compress        kaxmpr'ehs        k'aamprehs
          concert         k'aansrrt         kaxns'rrt
          conduct         kaxnd'ahkt        k'aandahkt
          confederate     kaxnf'ehdrrixt    kaxnf'ehdrreyt
          confine         kaxnf'ayn         k'aanfayn
          conflict        k'aanflihkt       kaxnfl'ihkt
          conglomerate    kaxnxgl'aamrixt   kaxnxgl'aamrreyt
          console         k'aansowl         kaxns'owl
          construct       kaxnstr'ahkt      k'aanstraxkt
          content         k'aantehnt        kaxnt'ehnt
          contest         k'aantehst        kaxnt'ehst
          contract        k'aantraekt       kaxntr'aekt
          contrast        k'aantraest       kaxntr'aest
          converse        k'aanvrrs         kaxnv'rrs
          convert         kaxnv'rrt         k'aanvrrt
          convict         kaxnv'ihkt        k'aanvihkt
          coordinate      kow'aordeneyt     kow'aordixnaxt
          decrease        diykr'iys         d'iykriys
          defect          daxf'ehkt         d'iyfehkt
          delegate        d'ehlixgaxt       d'ehlixg`eyt
          deliberate      daxl'ihbrraxt     daxl'ihbrreyt
          desert          d'ehzrrt          dixz'rrt
          desolate        d'ehselixt        d'ehseleyt
          diffuse         dixf'yuws         dixf'yuwz
          digest          d'ayjhehst        dayjh'ehst
          discharge       dixsch'arjh       d'ihscharjh
          discount        d'ihskawnt        dihsk'awnt
          dove            d'owv             d'ahv
          duplicate       d'uwplixkeyt      d'uwplixkaxt
          elaborate       axl'aebrraxt      axl'aebrreyt
          estimate        'ehstixmeyt       'ehstixmaxt
          excerpt         'ehksrrpt         ehks'rrpt
          excuse          ixksky'uz         ehksky'us
          expatriate      ehksp'eytriyaxt   ehksp'eytriieyt
          exploit         ixkspl'oyt        'ehksployt
          export          ehksp'ort         'ehksport
          extract         ehkstr'aekt       'ehkstraekt
          ferment         frrm'ehnt         f'rrmehnt
          frequent        fr'iykwixnt       friykw'eynt
          geminate        jh'ehmixnaxt      jh'ehmixneyt
          graduate        gr'aejhuweyt      gr'aejhuwaxt
          impact          'ihmpaekt         ixmp'aekt
          implant         ihmpl'aent        'ihmplaent
          import          'ihmport          ihmp'ort
          imprint         'ihmprihnt        ihmpr'ihnt
          incense         ixns'ehns         'ihnsehns



                                         112








          incline         ixnkl'ayn         'ihnklayn
          increase        ihnkr'iys         'ihnkriys
          insert          ihns'rrt          'ihnsrrt
          insult          ihns'ahlt         'ihnsaxlt
          interchange     'ihntrrcheynjh    ihntrrch'eynjh
          intimate        'ihntaxmaxt       'ihntaxmeyt
          invalid         ixnv'aelixd       'ihnvaxlixd
          just            jhixst            jh'ahst
          lead            l'iyd             l'ehd
          live            l'ihv             l'ayv
          minute          m'ihnixt          mayn'uwt
          miscount        m'ihskawnt        mihsk'awnt
          misprint        m'ihsprihnt       mihspr'ihnt
          misuse          mixs'yuz          mixs'yus
          moderate        m'aadrraxt        m'aadrreyt
          object          'aabjhehkt        axbjh'ehkt
          overrun         'owvrrrahn        owvrrr'ahn
          perfect         p'rrfixkt         prrf'ehkt
          permit          prrm'iht          p'rrmiht
          pervert         prrv'rrt          p'rrvrrt
          polish          p'aalihsh         p'owlixsh
          postulate       p'aascheleyt      p'aaschelaxt
          predicate       pr'ehdixkeyt      pr'ehdixkaxt
          predominate     prixd'aamixneyt   prixd'aamixnaxt
          present         priyz'ehnt        pr'ehzaxnt
          proceed         praxs'iyd         pr'owsiyd
          produce         praxd'uws         pr'aaduws
          progress        pr'aagrehs        praxgr'ehs
          project         pr'aajhehkt       praxjh'ehkt
          protest         pr'owtehst        prowt'ehst
          read            r'iyd             r'ehd
          reading         r'iydixnx         r'ehdixnx
          rebel           r'ehbel           rixb'ehl
          recall          rixk'aol          r'iykaol
          recap           riyk'aep          r'iykaep
          recess          r'iysehs          riys'ehs
          record          r'ehkrrd          rixk'ord
          recount         riyk'awnt         r'iykawnt
          refill          r'iyfihl          riyf'ihl
          refresh         riyfr'ehsh        r'iyfrehsh
          refund          riyf'ahnd         r'iyfahnd
          refuse          rixf'yuz          r'ehfyus
          reject          rixjh'ehkt        r'iyjhehkt
          relapse         r'iylaeps         rixl'aeps
          relay           r'iyley           rixl'ey
          remake          r'iymeyk          riym'eyk
          rerun           r'iy*rahn         riy*r'ahn
          research        r'iysrrch         riys'rrch
          resume          riy|z'uwm         r'ehzaxmey
          retake          riyt'eyk          r'iyteyk
          rewrite         riyr'ayt          r'iy*rayt
          segment         s'ehgmixnt        sehgm'ehnt
          separate        s'ehpaxreyt       s'ehpaxraxt



                                         113








          sow             s'ow              s'aw
          subject         s'ahbjhehkt       saxbjh'ehkt
          sublet          saxbl'eht         saxbl'eht
          subordinate     saxb'ordenaxt     saxb'ordeneyt
          survey          s'rrvey           srrv'ey
          suspect         s'ahspehkt        saxsp'ehkt
          syndicate       s'ihndixkixt      s'ihndixkeyt
          tear            t'er              t'ir
          torment         torm'ehnt         t'ormehnt
          transform       traensf'orm       tr'aensform
          transplant      traenspl'aent     tr'aensplaent
          transport       traensp'ort       tr'aensport
          upset           axps'eht          'ahpseht
          use             y'uwz             y'uws
          wind            w'ihnd            w'aynd
          wound           w'awnd            w'uwnd
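
          The slash convention described at the start of this appendix can
          be applied to text before it is sent to DECtalk PC. The Python
          sketch below prefixes chosen occurrences of a homograph with a
          slash so that the alternate pronunciation is used. The function
          name mark_alternate and its occurrence-index argument are
          hypothetical and are not part of any DECtalk interface; deciding
          which occurrences need the alternate form is left to the caller.

```python
import re

# Illustrative sketch: request the ALTERNATE pronunciation of selected
# homograph occurrences by placing "/" immediately before the word.
# mark_alternate is a hypothetical helper, not a DECtalk PC function.

def mark_alternate(text: str, occurrences: dict) -> str:
    """Prefix selected word occurrences with a slash.

    occurrences maps a lowercase word to the set of 0-based occurrence
    numbers (counted left to right) that should be marked.
    """
    seen = {}

    def repl(match):
        word = match.group(0)
        key = word.lower()
        index = seen.get(key, 0)      # how many times seen so far
        seen[key] = index + 1
        if index in occurrences.get(key, ()):
            return "/" + word
        return word

    return re.sub(r"[A-Za-z]+", repl, text)


if __name__ == "__main__":
    # Mark only the first "sow" (the animal) for the alternate form.
    print(mark_alternate("The sow ate while farmers sow seed.", {"sow": {0}}))
```

          Unmarked occurrences are left alone, so DECtalk PC continues to
          use its default pronunciation for them.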







































                                         114
































































                                         115








                                     APPENDIX  D



                                  VOICE PARAMETERS

          The following are the parameters which can be used to change
          DECtalk voices:

          Parameter  Voice Name        Characteristics
          :np        Perfect Paul      Standard male voice
          :nb        Beautiful Betty   Standard female voice
          :nh        Huge Harry        Deep male voice
          :nf        Frail Frank       Older male voice
          :nk        Kit the Kid       Child's voice (about 10 years old)
          :nr        Rough Rita        Deep female voice
          :nu        Uppity Ursula     Light female voice
          :nd        Doctor Dennis     Breathy male voice
          :nw        Whispering Wendy  Whispery female voice
          :nv        Variable Val      Definable voice

          This section explains the various voice parameters which can be
          used to make modifications to the existing voices or create a
          new voice.  See Chapter 6 for a detailed explanation of each of
          the vocal tract parameters.

          Speaker Definition [:dv  _] Parameter Range

          Parameter  Minimum  Maximum  Unit  Function
          save       --       --       --    Save current speaker definition
                                             in variable buffer

          Vocal Tract Parameters

          sx         0        1        --    Set sex to female (0) or male (1)
          hs         65       145      %     Head size
          f4         2000     4650     Hz    Fourth formant frequency
          f5         2500     4950     Hz    Fifth formant frequency
          b4         100      2048     Hz    Fourth formant bandwidth
          b5         100      2048     Hz    Fifth formant bandwidth


          Voicing Sound Source Parameters

          br         0        72       dB    Breathiness
          lx         0        100      %     Lax breathiness
          sm         0        100      %     Smoothness (high frequency
                                             attenuation)
          ri         0        100      %     Richness
          nf         0        100      --    Number of fixed samplings of
                                             glottal pulse open phase
          la         0        100      %     Laryngealization

          Intonation Parameters



                                         116








          bf         0        40       Hz    Baseline fall
          hr         2        100      Hz    Hat rise
          sr         1        100      Hz    Stress rise
          as         0        100      %     Assertiveness
          qu         0        100      %     Quickness
          ap         50       350      Hz    Average pitch
          pr         0        250      %     Pitch range


          Gain Adjustment Parameters

          gv         0        86       dB    Gain of voicing source
          gh         0        86       dB    Gain of aspiration source
          gf         0        86       dB    Gain of frication source
          gn         0        86       dB    Gain of nasalization
          g1         0        86       dB    Gain of cascade formant resonator
          g2         0        86       dB    Gain of cascade formant resonator
          g3         0        86       dB    Gain of cascade formant resonator
          g4         0        86       dB    Gain of cascade formant resonator
          g5         0        86       dB    Gain of cascade formant resonator
                                             (replaces lo)
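
          The ranges in the table above can be checked before a speaker
          definition is sent to DECtalk PC. The Python sketch below builds
          a [:dv _] command string from a few parameters and rejects
          values outside the listed ranges. It assumes that several
          parameter/value pairs may be written inside one [:dv _]
          command; see Chapter 6 for the exact command syntax. The names
          DV_RANGES and build_dv are hypothetical.

```python
# Illustrative sketch: build a speaker-definition command with range
# checks. The ranges are copied from the parameter table above (subset).
# ASSUMPTION: name/value pairs may share one "[:dv ...]" command.
# DV_RANGES and build_dv are hypothetical names, not DECtalk PC calls.

DV_RANGES = {
    "hs": (65, 145),   # head size, %
    "br": (0, 72),     # breathiness, dB
    "sm": (0, 100),    # smoothness, %
    "ri": (0, 100),    # richness, %
    "ap": (50, 350),   # average pitch, Hz
    "pr": (0, 250),    # pitch range, %
}


def build_dv(params: dict) -> str:
    """Return a '[:dv ...]' string, or raise if a value is out of range."""
    parts = []
    for name, value in params.items():
        if name not in DV_RANGES:
            raise KeyError(f"unknown parameter: {name}")
        low, high = DV_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
        parts.append(f"{name} {value}")
    return "[:dv " + " ".join(parts) + "]"


if __name__ == "__main__":
    print(build_dv({"ap": 110, "pr": 135}))
```

          Rejecting out-of-range values on the host side keeps mistakes
          visible in the application rather than in the speech output.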



































                                         117










          Speaker Definitions for all DECtalk Voices

          Parameter  Paul  Harry  Frank  Dennis  Betty  Ursula  Wendy  Rita  Kit
          sx         1     1      1      1       0      0       0      0     0
          hs         100   115    90     105     100    95      100    95    80
          f4         3300  3300   3650   3200    4450   4500    4500   4000  2500
          f5         3650  3850   4200   3600    2500   2500    2500   2500  2500
          b4         260   200    280    240     260    230     400    250   2048
          b5         330   240    300    280     2048   2048    2048   2048  2048
          br         0     0      50     38      0      0       55     46    47
          lx         0     0      50     70      80     50      80     0     75
          sm         3     12     46     100     4      60      100    24    5
          ri         70    86     40     0       40     100     0      20    40
          nf         0     10     0      10      0      10      10     0     0
          la         0     0      5      0       0      0       0      4     0
          bf         18    9      9      9       0      8       0      0     0
          hr         18    20     20     20      14     20      20     20    20
          sr         32    30     22     22      20     32      22     32    22
          as         100   100    65     100     35     100     50     65    65
          qu         40    10     0      50      55     30      10     30    50
          ap         122   89     155    110     208    240     200    106   306
          pr         100   80     90     135     140    135     175    80    210
          gv         65    65     63     63      65     65      51     65    65
          gh         70    70     68     68      70     70      68     70    70
          gn         74    73     75     76      72     74      75     73    71
          gf         70    70     68     68      72     70      70     72    72
          g1         68    71     63     75      69     67      69     69    69
          g2         60    60     58     60      65     65      62     72    69
          g3         48    52     56     52      50     51      53     48    52
          g4         64    64     66     61      56     58      55     54    50
          g5         86    81     86     84      81     80      83     83    73
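
          When designing a new voice it can help to see exactly which
          parameters distinguish two of the standard voices. The Python
          sketch below compares the Paul and Harry columns of the table
          above and reports only the parameters that differ. The values
          are transcribed from the table, with the la row read as nine
          values in column order. The names PARAMS, PAUL, HARRY and
          voice_diff are hypothetical.

```python
# Illustrative sketch: compare two speaker definitions from the table.
# Values are transcribed from the Paul and Harry columns; the la row is
# read as nine values in column order, so Harry's la is taken as 0.
# PARAMS, PAUL, HARRY and voice_diff are hypothetical names.

PARAMS = ("sx hs f4 f5 b4 b5 br lx sm ri nf la bf hr sr as qu ap pr "
          "gv gh gn gf g1 g2 g3 g4 g5").split()

PAUL = (1, 100, 3300, 3650, 260, 330, 0, 0, 3, 70, 0, 0, 18, 18, 32,
        100, 40, 122, 100, 65, 70, 74, 70, 68, 60, 48, 64, 86)

HARRY = (1, 115, 3300, 3850, 200, 240, 0, 0, 12, 86, 10, 0, 9, 20, 30,
         100, 10, 89, 80, 65, 70, 73, 70, 71, 60, 52, 64, 81)


def voice_diff(first, second):
    """Return {parameter: (first_value, second_value)} where they differ."""
    return {name: (a, b)
            for name, a, b in zip(PARAMS, first, second)
            if a != b}


if __name__ == "__main__":
    for name, (paul, harry) in voice_diff(PAUL, HARRY).items():
        print(f"{name}: Paul {paul}, Harry {harry}")
```

          The parameters that differ are a natural starting point when
          adjusting Variable Val toward one voice or the other.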





                                        Index





                                          A

             Abbreviations, 50
               in dictionary, 51
             Allophones, 62
             Amplifier, 11



                                         118








             Applications
               advanced, 99

                                          B

             BIOS, 11, 13
               Interface, 16
               Parameters Block, 14
               Self-Test, 15
             Boundary
               clause, 72
               morpheme, 69
               syllable, 69
               word, 68
             Buffer
               Size, 21

                                          C

             Common errors, 102
             Compatibility Mode, 19
             Compound nouns, 70
             Consonants
               allophones, 63
               syllabic, 62

                                          D

             Data Synchronization, 40
             Dates, 50
             dictionary
               lookup, 18
               pronunciation accuracy, 18
               user, 21
               user dictionary, 60
             Digital Signal Processor, 11
             Driver
               Configuration Options, 23
             Duration
               control of, 75
             Durations, 18
               comma pause, 82
               pause, 82

                                          E

             Exclamation point, 74

                                          F

             Flush Speaking, 35

                                          H



                                         119








             Homographs, 58

                                          I

             Installation Guide, 8, 11
             Installation, 12
             Intonation, 89
             IRQ
               Sensing, 15

                                          L

             letter-to-sound, 18

                                          M

             Microprocessor, 11
             Module
               Configuration, 12
               Controls, 12

                                          N

             Number processing
               rules, 46
             Numbers
               cardinal, 47
               money amounts, 49
               ordinal, 49
               part, 46

                                          P

             Paragraph
               new, 74
             parser
               sentence, 18
               word, 18
             Period, 73
             Period Pause, 36
             Phonemics, 54, 58
               consonant phonemes, 59
               correction, 60
               theory, 57
               transcription, 54
               vowel phonemes, 59
             Pitch
               control of, 68, 75
             Power, 10



                                         120








             Power Supply, 11
               Requirements, 11
             Pronunciation, 21
               Accuracy, 21
               errors, 55
               Heuristics, 21
               how to correct, 56
             Proper names, 99
             Punctuation
               Speak, 38

                                          Q

             Question mark, 73

                                          R

             Resume Speaking, 35

                                          S

             Select Voice, 42
             Self-Test, 12, 13
             Serial Line, 19
             Silence phoneme, 64
             Singing, 77
             Softloadability, 19
             Software, 12
             Speaking
               Stop, 20
             Speaking Rate, 81
             Speech
               rate, 22
             Speech Control
               Command Set, 28
               Commands, 28
             Speech quality, 100
               optimization of, 100
             Spelling
               word spellout strategies, 51, 52
             Stress, 64
               emphatic, 67
               primary, 65
               secondary, 66
               symbols, 65
             Syllables
               unstressed, 67
             Syntax, 64
               symbols, 65




                                         121








                                          T

             Telephone numbers, 100
             Text to Speech
               Conversion, 17
             Text tuning
               example of, 97
             Tone Generation, 20, 41
             TSR
               Communicating with, 24
                  Function Codes, 24
                  Voice Control Command Set, 24

                                          U

             User dictionary, 60
               loading, 60

                                          V

             Verb Phrase, 70
             Voice
               commands
                  syntax of, 95
               defining, 83
               parameters, 83
                  assertiveness, 91
                  average pitch, 91
                  baseline fall, 89
                  breathiness, 87
                  gains, 92, 93
                     cascade vocal tract, 93
                     sound source, 93
                  hat rise, 90
                  head size, 84
                  higher formants, 86
                  laryngealization, 88
                  lax breathiness, 87
                  loudness, 93
                  Nopen fixed, 88
                  overloads, 92
                  pitch range, 91
                  quickness, 91
                  richness, 88
                  save voice, 94
                  select voice, 94
                  sex, 84
                  smoothness, 87
                  stress rise, 90
               standard, 83



                                         122








             Voice Definition, 29
             Voices
               characteristics of, 80
             Volume Control
               Settable, 20
             Volume Selection, 43

                                          W

             waveform, 17














































                                         123
































































                                         124

